Data Engineer - Pune, India - Coditas
Description
We are looking for data engineers who have the right attitude, aptitude, skills, empathy, compassion, and a hunger for learning, and who will build products in the data analytics space. You should bring a passion for shipping high-quality data products, an interest in the data products space, and curiosity about the bigger picture: building a company, developing a product, and growing its people.
Roles and Responsibilities
● Develop and manage robust ETL pipelines using Apache Spark (Scala); a minimal sketch of such a job follows this list
● Demonstrate a solid understanding of Spark concepts, performance-optimization techniques, and governance tools
● Develop highly scalable, reliable, and high-performance data processing pipelines to extract, transform, and load data from various systems into the Enterprise Data Warehouse/Data Lake/Data Mesh
● Collaborate cross-functionally to design effective data solutions
● Implement data workflows using AWS Step Functions for efficient orchestration, and leverage AWS Glue and its crawlers for seamless data cataloging and automation
● Monitor, troubleshoot, and optimize pipeline performance and data quality
● Maintain high coding standards and produce thorough documentation. Contribute to high-level (HLD) and low-level (LLD) design discussions.
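To make the day-to-day concrete, here is a minimal sketch of the kind of Spark (Scala) ETL job described above: extract raw events, transform them, and load partitioned output into a data lake. The S3 paths, column names, and filter logic are hypothetical placeholders, not a prescribed implementation.

    import org.apache.spark.sql.{SparkSession, functions => F}

    object OrdersEtl {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("orders-etl")
          .getOrCreate()

        // Extract: read raw JSON events from a landing zone (hypothetical path).
        val raw = spark.read.json("s3://example-landing-zone/orders/")

        // Transform: keep completed orders, normalize timestamps,
        // and derive a date column to partition by.
        val orders = raw
          .filter(F.col("status") === "COMPLETED")
          .withColumn("order_ts", F.to_timestamp(F.col("order_ts")))
          .withColumn("order_date", F.to_date(F.col("order_ts")))

        // Load: write partitioned Parquet into the data lake (hypothetical path).
        orders.write
          .mode("overwrite")
          .partitionBy("order_date")
          .parquet("s3://example-data-lake/curated/orders/")

        spark.stop()
      }
    }

In practice a job like this would be scheduled and monitored by an orchestrator such as AWS Step Functions, with AWS Glue crawlers keeping the catalog in sync with the written partitions.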
Technical Skills
● Minimum 3 years of progressive experience building solutions in Big Data environments.
● A strong ability to build robust, resilient data pipelines that are scalable, fault-tolerant, and reliable in moving data.
● 3+ years of hands-on expertise in Python, Spark, and Kafka (see the Kafka ingestion sketch after this list).
● Strong command of AWS services such as EMR, Redshift, Step Functions, and AWS Glue (including Glue crawlers).
● Strong hands-on skills with SQL and NoSQL technologies.
● Sound understanding of data warehousing, data modeling, and ETL concepts.
● Familiarity with High-Level Design (HLD) and Low-Level Design (LLD) principles.
● Excellent written and verbal communication skills.
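As an illustration of the Spark-and-Kafka experience we look for, here is a minimal Spark Structured Streaming sketch that ingests a Kafka topic into a data lake. The broker address, topic name, event schema, and paths are all hypothetical, and the job assumes the spark-sql-kafka connector is on the classpath.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{col, from_json}
    import org.apache.spark.sql.types.{StringType, StructType, TimestampType}

    object ClickstreamIngest {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("clickstream-ingest")
          .getOrCreate()

        // Hypothetical schema for the Kafka message payload.
        val schema = new StructType()
          .add("user_id", StringType)
          .add("page", StringType)
          .add("event_ts", TimestampType)

        // Read from Kafka (broker and topic names are placeholders).
        val events = spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092")
          .option("subscribe", "clickstream")
          .load()
          .select(from_json(col("value").cast("string"), schema).as("e"))
          .select("e.*")

        // Append micro-batches as Parquet; the checkpoint location gives
        // the stream fault tolerance and exactly-once file output.
        val query = events.writeStream
          .format("parquet")
          .option("path", "s3://example-data-lake/raw/clickstream/")
          .option("checkpointLocation", "s3://example-data-lake/checkpoints/clickstream/")
          .outputMode("append")
          .start()

        query.awaitTermination()
      }
    }

The checkpoint directory is what lets a restarted job resume from its last committed Kafka offsets rather than reprocessing the topic; this is the kind of fault-tolerance reasoning the role calls for.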