No more applications are being accepted for this job
Data Engineer - Pune, India - Ascentt
Description
We are looking for a Data Engineer with 3-7+ years of experience in a production environment.
Must Have
- Spark Developer with experience in large-scale production deployments, data pipelines, and large-scale computation.
- Experience using programming languages such as Scala and Python (PySpark) to mine and query data for analysis, and occasionally using big data SQL engines.
- Develop data set processes for data modeling, mining, and production.
- Create Scala/Spark jobs for data transformation and aggregation.
- Working experience with AWS services such as EMR, Athena, Glue, Redshift, and Lambda.
- Experience in developing and deploying Databricks jobs, with knowledge of optimizing cluster configurations.
- Responsible for the maintenance, improvement, cleaning, and manipulation of data in the business operational and analytics databases.
- Define and build the data pipelines that enable faster, better, data-informed decision-making within the business.
- Design and develop scalable ETL packages from the business source systems, and develop ETL routines to populate databases from those sources and to create aggregates.
Nice to Have
- Experience with NoSQL databases such as HBase, Cassandra, and MongoDB.
- Experience with Apache Spark streaming and batch frameworks.
- Experience with Kafka, Storm, and ZooKeeper.
- Spark query tuning and performance optimization.
- Produce unit tests for Spark transformations and helper methods.
- Recommend ways to improve data reliability, efficiency, and quality.