PySpark DSA 4 to 8 Years Pan India - Bengaluru, Karnataka, India - Capgemini

Capgemini
Bengaluru, Karnataka, India

2 weeks ago

Posted by:

Deepika Kaur

beBee Recruiter


Description

Job Description:


  • Bachelor's or master's degree in computer science, engineering, or a related field
  • 5-8 years of experience in data engineering and machine learning
  • Extensive experience with Python programming, including Pandas, NumPy, and various libraries for data manipulation and analysis
  • Hands-on experience with Spark architecture, including DataFrame operations, lazy evaluation, and UDFs
  • Strong understanding of DevOps principles and experience with tools like Docker, Kubernetes, and Jenkins
  • Familiarity with machine learning libraries such as Spark MLlib or Azure ML is a plus
  • Previous experience in product-oriented organizations or Tier 1 companies is preferred

Primary Skills:


  • Use advanced SQL techniques such as joins with inline views and self-joins for data manipulation and extraction
  • Proficient in Python programming, including but not limited to variables, functions, loops, conditions, and various data structures
  • Implement object-oriented programming concepts, including polymorphism, abstract classes, and interfaces, for code modularity and reusability
  • Develop and maintain data pipelines on Spark, with an understanding of its architecture: nodes, clusters, lazy evaluation, and the DAG
  • Perform DataFrame operations and write user-defined functions (UDFs) for efficient data processing in Spark
  • Integrate APIs for data retrieval and interaction with external systems
  • Collaborate with cross-functional teams to understand business requirements and translate them into technical solutions

Secondary Skills:


  • Good Communication Skills
