Lead Data Engineer - New Delhi, India - Evnek Technologies Pvt Ltd

    Description
    Job Description: 8 years' experience in developing scalable Big Data applications or solutions on distributed platforms.

    • Able to partner with others in solving complex problems by taking a broad perspective to identify innovative solutions.
    • Strong skills in building positive relationships across Product and Engineering.
    • Able to influence and communicate effectively, both verbally and in writing, with team members and business stakeholders.
    • Able to quickly pick up new programming languages, technologies, and frameworks.
    • Experience working in Agile and Scrum development processes.
    • Experience working in a fast-paced, results-oriented environment.
    • Experience in Amazon Web Services (AWS), mainly S3, Managed Airflow, EMR/EC2, IAM, etc.
    • Experience working with Data Warehousing tools, including SQL databases, Presto, and Snowflake.
    • Experience architecting data products in Streaming, Serverless, and Microservices architectures and platforms.
    • Experience working with Data platforms, including EMR, Airflow, Databricks (Data Engineering & Delta Lake components, and Lakehouse Medallion architecture), etc.
    • Experience creating/configuring Jenkins pipelines for a smooth CI/CD process for managed Spark jobs, building Docker images, etc.
    • Experience working with distributed technology tools, including Spark, Python, Scala
    • Working knowledge of Data warehousing, Data modelling, Governance and Data Architecture
    • Working knowledge of Reporting & Analytical tools such as Tableau, QuickSight, etc.
    • Demonstrated experience in learning new technologies and skills.
    • Bachelor's degree in Computer Science, Information Systems, Business, or another relevant subject area.
    Requirements

    Mandatory
    • Experience in Cloud Computing, e.g., AWS, GCP, Azure, etc.
    • Airflow workflow scheduling tool for creating data pipelines
    • Experience in EMR/EC2, Databricks, Delta Lake (or) Spark Streaming
    • DWH tools incl. SQL database, Presto, and Snowflake
    • GitHub source control tool & experience with creating/configuring Jenkins pipelines

    Critical
    • Performance Tuning: optimize SQL and PySpark for performance
    • Experience in distributed technology tools, viz. SQL, Spark, Python, PySpark

    Skills: SQL, Spark, Python, PySpark, EMR/EC2, Databricks