Site Reliability Engineer - Bangalore Urban, India - PhonePe

    Technology / Internet
    Description

    JOB DESCRIPTION: We are looking for engineers who are passionate about reliability, performance, and efficiency, and who have experience building tools, services, and automation to manage and improve production services.

    • Work on systems internals and security, Linux, networking, and monitoring to improve the reliability and performance of the next generation of distributed systems and containerized deployments.
    • Diagnose and troubleshoot complex distributed systems handling millions of queries per second.
    • Knowledge of Linux cloud services using KVM/QEMU/LVM.
    • In-depth knowledge of Perl/Go/Python to automate tasks with minimal manual intervention.
    • Day-to-day work is heavily command-line driven, which requires a strong understanding of Linux.
    • Troubleshoot issues across the entire stack: hardware, software, application, and network.
    • Knowledge of database technologies, specifically MySQL and NoSQL, is good to have.
    • Participate in 24x7 on-call rotations.
    • Design, build, and maintain core infrastructure that enables PhonePe to scale to support hundreds of thousands of concurrent users.
    • Take an active part in the analysis and system improvement plan.
    • Drive performance testing, capacity planning, and high-availability practices.
    • Own implementations of new technologies while ensuring proper testing and documentation.
    • Proactively monitor, identify, and resolve issues that could impact our infrastructure.
    • Be a natural team player with a resourceful attitude.
    • Buddy new team members and get them production-ready.