
    Site Reliability Engineer - Bangalore, India - PhonePe

    PhonePe
    Description

    About PhonePe Group:

    PhonePe is India's leading digital payments company with 50 crore (500 Million) registered users and 3.7 crore (37 Million) merchants covering over 99% of the postal codes across India. On the back of its leadership in digital payments, PhonePe has expanded into financial services (Insurance, Mutual Funds, Stock Broking, and Lending) as well as adjacent tech-enabled businesses such as Pincode for hyperlocal shopping and Indus App Store which is India's first localized App Store. The PhonePe Group is a portfolio of businesses aligned with the company's vision to offer every Indian an equal opportunity to accelerate their progress by unlocking the flow of money and access to services.

    Culture

    At PhonePe, we take extra care to make sure you give your best at work, every day. Creating the right environment for you is just one of the things we do. We empower people and trust them to do the right thing. Here, you own your work from start to finish, right from day one. Being enthusiastic about tech is a big part of being at PhonePe. If you like building technology that impacts millions, ideating with some of the best minds in the country, and executing on your dreams with purpose and speed, join us.

    Job Overview:

    As a Site Reliability Engineer (SRE) specializing in the on-premises Data Platform, you will play a critical role in deploying our Cloudera Data Platform (CDP) infrastructure and ensuring its reliability, scalability, and performance. You will collaborate closely with cross-functional teams to design, implement, and maintain robust systems that support our data-driven initiatives. The ideal candidate will have a deep understanding of Cloudera Data Platform, strong troubleshooting skills, and a proactive mindset towards automation and optimization. You will be responsible for the smooth operation, performance, and security of large, high-density Cloudera-based infrastructure.

    Key Responsibilities:

  • Implementation of Cloudera Data Platform: Lead the implementation process of Cloudera Data Platform on-premises, including planning, installation, configuration, and integration with existing systems.
  • Infrastructure Management: Manage and maintain the Cloudera-based infrastructure, ensuring optimal performance, high availability, and scalability. This includes monitoring system health, troubleshooting issues, and performing routine maintenance tasks.
  • Data Security and Compliance: Implement and enforce security best practices to safeguard data integrity and confidentiality within the Cloudera environment. Ensure compliance with relevant regulations and standards (e.g., GDPR, HIPAA, DPR).
  • Performance Optimization: Continuously optimize the Cloudera infrastructure to enhance performance, efficiency, and cost-effectiveness. Identify and resolve bottlenecks, tune configurations, and implement best practices for resource utilization.
  • Capacity Planning: Monitor resource utilization trends and plan for future capacity needs. Proactively identify potential capacity constraints and propose solutions to address them.
  • Backup and Disaster Recovery: Implement robust backup and disaster recovery strategies to ensure data protection and business continuity. Test and maintain backup and recovery procedures regularly.
  • Patches & Upgrades: Routinely apply recommended patches and perform rolling upgrades of the platform in accordance with the advisory from Cloudera, InfoSec and Compliance.
  • Documentation and Knowledge Sharing: Create comprehensive documentation for configurations, processes, and procedures related to the Cloudera Data Platform. Share knowledge and best practices with team members to foster continuous learning and improvement.
  • Collaboration and Communication: Collaborate effectively with cross-functional teams including data engineers, developers, and IT operations personnel. Communicate project status, issues, and resolutions clearly and promptly.
    Qualifications:

  • Bachelor's degree in Computer Science, Engineering, or related field.
  • Proficiency in Linux system administration, shell scripting, and networking concepts.
  • 5+ years of experience in managing Big Data infrastructure.
  • Strong understanding of distributed computing principles and experience with Hadoop ecosystem technologies (HDFS, MapReduce, YARN, Hive, Spark, etc.).
  • Hands-on experience with configuration management tools (e.g., Salt, Ansible, Puppet, Chef).
  • Strong scripting skills (e.g., Python, Bash) for automation and troubleshooting.
  • Experience with monitoring and logging solutions (e.g., Prometheus, Grafana, ELK stack).
  • Knowledge of networking principles and protocols (TCP/IP, UDP, DNS, DHCP, etc.).
  • Experience managing *nix-based machines (e.g., Ubuntu, Fedora, Red Hat) and strong working knowledge of essential Unix programs and tools.
  • Excellent communication skills and the ability to collaborate effectively with cross-functional teams.
  • Excellent analytical, problem-solving, and troubleshooting skills.
  • Proven ability to work well under pressure and manage multiple priorities simultaneously.
    Good To Have:

  • Cloudera Certified Administrator (CCA) or Cloudera Certified Professional (CCP) certification.
  • Minimum 5 years of experience managing and administering medium/large Hadoop-based environments (> 100 machines); Cloudera Data Platform (CDP) experience is highly desirable.
  • Familiarity with Open Data Lake components such as Ozone, Iceberg, Spark, Flink, etc.
  • Familiarity with containerization and orchestration technologies (e.g., Docker, Kubernetes, OpenShift) is a plus.

    PhonePe Full Time Employee Benefits (Not applicable for Intern or Contract Roles):

  • Insurance Benefits - Medical Insurance, Critical Illness Insurance, Accidental Insurance, Life Insurance
  • Wellness Program - Employee Assistance Program, Onsite Medical Center, Emergency Support System
  • Parental Support - Maternity Benefit, Paternity Benefit Program, Adoption Assistance Program, Day-care Support Program
  • Mobility Benefits - Relocation benefits, Transfer Support Policy, Travel Policy
  • Retirement Benefits - Employee PF Contribution, Flexible PF Contribution, Gratuity, NPS, Leave Encashment
  • Other Benefits - Higher Education Assistance, Car Lease, Salary Advance Policy

