Data Engineer-GCP - Pune, India - Pixeldust Technologies

    Job Description:

    We are seeking a skilled and experienced GCP (Google Cloud Platform) Data Engineering Specialist to join our team. The ideal candidate will have 3+ years of relevant experience and expertise in BigQuery, Dataflow, Spark, and Pub/Sub. As a Data Engineering Specialist, you will design, develop, and maintain data pipelines, data integration solutions, and ETL processes on GCP to support our data-driven applications and analytics initiatives.

    Responsibilities:

    • Design, develop, and maintain data pipelines and ETL processes using BigQuery, Dataflow, Spark, and Pub/Sub on GCP (a streaming-pipeline sketch follows this list).
    • Collaborate with data scientists, data analysts, and other stakeholders to gather requirements and define data engineering solutions that meet business needs.
    • Optimize and troubleshoot data pipelines for performance, reliability, and scalability.
    • Ensure data quality and integrity by implementing data validation, cleansing, and transformation processes.
    • Monitor and manage data processing workflows, troubleshoot, and resolve data processing issues.
    • Develop and maintain documentation for data engineering processes, workflows, and best practices.
    • Stay updated with the latest advancements in GCP data engineering technologies and recommend and implement improvements to existing data engineering processes.
    • Apply strong hands-on experience with GCP, Hadoop, Hive, Spark, Unix shell scripting, and Python.
    • Troubleshoot complex issues across GCP, Hadoop, Hive, Spark, and Unix shell scripting environments.
    • Construct and maintain ELT/ETL jobs that source data from disparate systems across the enterprise and load it into an enterprise data lake.
    • Gather and translate user requirements into technical specifications, and develop automated data pipelines to meet business demand.
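
    For illustration only (not part of the role requirements): a minimal sketch of the kind of streaming pipeline described above, written with the Apache Beam Python SDK that Dataflow runs. The Pub/Sub subscription and BigQuery table names are hypothetical. It parses messages, applies a simple validation gate, and routes bad records to a dead-letter table.

        import json

        import apache_beam as beam
        from apache_beam.options.pipeline_options import PipelineOptions


        class ParseAndValidate(beam.DoFn):
            """Parse each Pub/Sub message as JSON; route bad records aside."""

            VALID = "valid"
            INVALID = "invalid"

            def process(self, message: bytes):
                try:
                    record = json.loads(message.decode("utf-8"))
                    # Hypothetical schema check: require an "event_id" field.
                    if "event_id" not in record:
                        raise ValueError("missing event_id")
                    yield beam.pvalue.TaggedOutput(self.VALID, record)
                except Exception:
                    # Dead-letter the raw payload for later inspection.
                    yield beam.pvalue.TaggedOutput(
                        self.INVALID, {"raw": message.decode("utf-8", "replace")})


        def run():
            # streaming=True because the source is an unbounded Pub/Sub stream.
            options = PipelineOptions(streaming=True)
            with beam.Pipeline(options=options) as p:
                parsed = (
                    p
                    | "ReadPubSub" >> beam.io.ReadFromPubSub(
                        subscription="projects/my-project/subscriptions/events-sub")
                    | "ParseAndValidate" >> beam.ParDo(
                        ParseAndValidate()).with_outputs(
                            ParseAndValidate.VALID, ParseAndValidate.INVALID))
                # Both tables are assumed to exist already (CREATE_NEVER).
                parsed[ParseAndValidate.VALID] | "WriteValid" >> beam.io.WriteToBigQuery(
                    "my-project:analytics.events",
                    write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                    create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER)
                parsed[ParseAndValidate.INVALID] | "WriteDeadLetter" >> beam.io.WriteToBigQuery(
                    "my-project:analytics.events_dead_letter",
                    write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                    create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER)


        if __name__ == "__main__":
            run()

    The same code runs locally on the DirectRunner for testing and on Dataflow when the usual runner options (--runner=DataflowRunner, plus project and region) are supplied.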

    Skills and Qualifications

    • Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
    • 3+ years of relevant experience in data engineering with a strong focus on GCP technologies, including BigQuery, Dataflow, Spark, and Pub/Sub.
    • Hands-on experience in designing, developing, and optimizing data pipelines, ETL processes, and data integration solutions on GCP.
    • Proficiency in programming languages such as Python or Scala, with experience writing efficient and scalable data processing code (a PySpark sketch follows this list).
    • Strong understanding of data modeling, data warehousing, and data integration concepts.
    • Knowledge of data lake and data warehouse architectures, data governance, and data security best practices on GCP.
    • Excellent problem-solving skills with the ability to troubleshoot and resolve complex data engineering issues.
    • Strong communication and collaboration skills to work effectively with cross-functional teams and stakeholders.
    • GCP certifications, such as Google Cloud Certified Professional Data Engineer, are a plus.
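
    For illustration only: a minimal PySpark sketch of the kind of batch ELT step referenced above, reading a raw extract from a hypothetical GCS landing path, applying light cleansing, and writing partitioned Parquet into a curated data-lake zone. Bucket names and fields are assumptions, not a prescribed design.

        from pyspark.sql import SparkSession, functions as F

        spark = SparkSession.builder.appName("orders_elt").getOrCreate()

        # Raw JSON extract from a hypothetical landing bucket.
        raw = spark.read.json("gs://example-landing/orders/2024-06-01/*.json")

        cleaned = (
            raw
            # Basic data-quality gate: drop records missing the business key.
            .filter(F.col("order_id").isNotNull())
            # Normalize the event timestamp and derive a partition column.
            .withColumn("order_ts", F.to_timestamp("order_ts"))
            .withColumn("order_date", F.to_date("order_ts"))
            # De-duplicate on the business key (keeps one row per key).
            .dropDuplicates(["order_id"]))

        (cleaned.write.mode("overwrite")
            .partitionBy("order_date")
            .parquet("gs://example-datalake/curated/orders/"))

    A job like this is typically submitted to a Dataproc cluster (for example via gcloud dataproc jobs submit pyspark) or rewritten as a Beam/Dataflow job, depending on the team's standard runtime.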