- Candidate must have experience in real-time data processing using Spark or similar distributed processing frameworks.
- Knowledge of the modern data ecosystem is a MUST (Trino, distributed processing, Hive, object storage).
- Good knowledge of Kubernetes/Docker is a MUST
- Strong knowledge of Apache Spark Streaming (PySpark) is a MUST
- Strong knowledge of Apache Kafka is a MUST
- Good knowledge of SQL is a MUST
- Good knowledge of Python is a MUST
- Good to have: Java.
- Good to have: conceptual knowledge of Apache Airflow, Apache Atlas, Apache Ranger, Postgres, Presto/Trino, and Superset.
- Working knowledge of Apache Flink will be a huge advantage.