Data Engineer - Coimbatore, India - Digisailor
Description
Role - Data Engineer
Preferred Experience - 6 to 8 years
Location - Remote or Hybrid (Coimbatore)
Position - Contract
We are seeking a talented Data Engineer to join our team and play a key role in building and maintaining client data infrastructure. The ideal candidate has a strong background in data warehousing, ETL processes, cloud technologies, and Python programming.
Responsibilities
- Work closely with data scientists, analysts, and stakeholders to understand data requirements and ensure the availability and reliability of data for analysis and reporting.
- Use the Snowflake data warehouse platform to design and optimize data models, schemas, and queries for efficient data storage and retrieval.
- Implement data processing tasks in Python using relevant libraries (e.g., Pandas, NumPy) for data manipulation, cleansing, and aggregation.
- Develop and manage data workflows and scheduled tasks in Apache Airflow to automate and orchestrate data pipeline activities.
- Write SQL queries for data extraction, transformation, and analysis, and optimize them for performance and scalability.
- Use AWS services such as S3, Redshift, Glue, and Lambda for data storage, processing, and serverless computing.
- Implement real-time data processing and streaming analytics with Apache Spark and Kafka to handle high-volume, high-velocity data streams.
- Containerize data applications and services with Docker for easy deployment, scaling, and management, and orchestrate them with Kubernetes for container scheduling, scaling, and automation.
- Use version control systems such as Git to manage the codebase, collaborate with team members, and ensure code quality and reliability.
Qualifications
- Bachelor's degree in Computer Science, Engineering, or a related field; Master's degree preferred.
- Proven experience as a Data Engineer or in a similar role, with a strong focus on data warehousing, ETL, and cloud technologies.
- Proficiency in Python and experience with relevant libraries/frameworks for data engineering tasks.
- Strong SQL skills and experience with relational databases (e.g., PostgreSQL, MySQL) and NoSQL databases (e.g., MongoDB, Cassandra).
- Hands-on experience with the Snowflake data warehouse platform, including designing and optimizing data models and queries.
- Familiarity with Apache Airflow for building and managing data workflows and scheduled tasks.
- Knowledge of AWS and experience building data solutions with services such as S3, Redshift, Glue, and Lambda.
- Experience with Apache Spark and Kafka for real-time data processing and streaming analytics.
- Understanding of microservices architecture principles and hands-on experience designing and deploying microservices-based data applications.
- Proficiency in containerization technologies such as Docker and container orchestration platforms such as Kubernetes.
- Familiarity with version control systems such as Git, and experience with code reviews, branching strategies, and continuous integration/continuous deployment (CI/CD) pipelines.