PySpark DSA, 4 to 8 Years, Pan India - Bengaluru, Karnataka, India - Capgemini
Description
Job Description:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field
- 5-8 years of experience in data engineering and machine learning
- Extensive experience with Python programming, including Pandas, NumPy, and various libraries for data manipulation and analysis
- Hands-on experience with Spark architecture, including DataFrame operations, lazy evaluation, and UDFs
- Strong understanding of DevOps principles and experience with tools like Docker, Kubernetes, and Jenkins
- Familiarity with machine learning libraries such as Spark MLlib or Azure ML is a plus
- Previous experience in product-oriented organizations or Tier 1 companies is preferred
Primary Skills:
- Use advanced SQL techniques such as joins with inline views and self-joins for data manipulation and extraction
- Proficient in Python programming, including but not limited to variables, functions, loops, conditions, and various data structures
- Implement object-oriented programming concepts, including polymorphism, abstract classes, and interfaces, for code modularity and reusability
- Develop and maintain data pipelines on Spark, with an understanding of nodes, clusters, lazy evaluation, and the DAG execution model
- Perform DataFrame operations and write user-defined functions (UDFs) for efficient data processing in Spark
- Integrate APIs for data retrieval and interaction with external systems
- Collaborate with cross-functional teams to understand business requirements and translate them into technical solutions
Secondary Skills:
- Good communication skills
More jobs from Capgemini
- Java Fullstack, Chennai, India - 2 weeks ago
- Msd B2 Bangalore/mum/pune, Bengaluru, India - 1 week ago
- Aix Administrator, Pune, India - 2 weeks ago
- IoT Security, Pune, India - 6 days ago
- Ugtm Sbu Anchor 9 to 12 Years Mumbai, Mumbai, India - 1 week ago
- Senior Software Engineer, Bengaluru, India - 2 weeks ago