
    Site Reliability Engineer/Architect - Bangalore/Any Location, India - Grizmo Labs

    Grizmo Labs
    Grizmo Labs Bangalore/Any Location, India

    Found in: Talent IN 2A C2 - 2 days ago

    permanent Technology / Internet
    Description

    Responsibilities :

    • Own the infrastructure and APM, and work with developers and systems engineers to build, release, monitor, and run the service reliably, exceeding the agreed SLAs.
    • Write software to automate API-driven tasks at scale and contribute to the product codebase in Java, JS, React, Node, Go, and Python.
    • Write automation to reduce toil and eliminate manual, repeatable tasks.
    • Work with Ansible, Puppet, Chef, Terraform, or another config management/orchestration suite, know where it is broken, work toward fixing it, and explore new alternatives.
    • Define and accelerate the implementation of support processes, tools, and best practices.
    • Maintain services once they are live by measuring and monitoring availability, latency, and overall system reliability.
    • Handle cross-team performance issues, from identifying the cause to determining the areas of improvement and driving those actions to closure.
    • Performance and maturity baselining of Systems, tools maturity, coverage, metrics, technology, and engineering practices.
    • Define, measure, and improve reliability metrics (SLO/SLI), observability (monitoring, logging, and tracing solutions), and ops processes (incident and problem management), and streamline and automate release management.
    • Build dashboards to provide visibility into the performance of the applications.
    • Purposefully create chaos in the production environment in a controlled manner to validate the reliability of systems.
    • Mentor and coach other SREs in the organization.
    • Provide written and verbal updates to executives and the stakeholders of the application in the organization.
    • Understand the current process and system setup, and propose the improvements needed in processes and technology so that the application exceeds the desired Service Level Objective.
    • Troubleshoot, debug and diagnose operational issues and drive them to closure.
    • Understanding of software delivery life cycles, particularly Agile/Lean, and DevOps.

    Requirements :

    • A strong believer in automation as a way to sustain continuous improvement: automating toil and runbooks, and improving the ability of applications to auto-heal, leading to improved reliability.
    • 15+ years of experience in the Development and Operations of applications/services in production that have uptime over 99.9%.
    • 8+ years of experience as an SRE handling web-scale applications.
    • Strong hands-on coding experience in one or more programming languages such as Python, Golang, Java, Bash, etc.
    • Good understanding of Observability (monitoring, logging, tracing, metrics) and chaos engineering concepts.
    • Proficiency in using observability tools (for example, New Relic, Datadog, etc.) for monitoring, logging, and tracing.
    • Expert-level hands-on knowledge of the public cloud platforms AWS and/or Google Cloud Platform.
    • A professional-level certificate in one of the public clouds is highly desirable.
    • Must have hands-on experience in using configuration management systems such as Ansible or SaltStack and infrastructure automation tools like Terraform or CloudFormation.
    • Should have used alerting systems such as PagerDuty.
    • Should have implemented solutions around Service Level Indicators (SLIs) and Service Level Objectives (SLOs) for services.
    • Measurement should have been within a system and across systems in distributed systems.
    • Should have supported Production Incidents (PIs) on critical applications of a company.
    • Proven experience in handling large-scale and growing infrastructure across data centers and heterogeneous cloud platforms.
    • Experience as a service owner in managing large geographically diverse stakeholders.
    • Ability to work with creative, fast-growing engineering teams and motivate them to deliver their best work.
    • History of driving innovation.

  • SkySys

    Site Reliability Engineer

    Found in: Talent IN C2 - 1 day ago


    SkySys Delhi Division, India

    Role: Site Reliability Engineer (SRE) · Position Type: Full-Time Contract (40hrs/week) · Contract Duration: Long Term · Work Time zone: IST · Work Schedule: 8 hours/day (Mon-Fri) · Location: 100% remote (candidate can work from anywhere in India) · Must haves: Monitoring and dep ...

  • MAYNOR CONSULTING

    Site Reliability Engineer

    Found in: Talent IN 2A C2 - 2 days ago


    MAYNOR CONSULTING Any Location/Hyderabad, India permanent

    On-Call Responsibility : · You will be point of contact for alerts and incidents and responsible for overall system reliability and availability · - Help maintain mission critical services. · - Maintain services once they are live by measuring and monitoring availability, latenc ...

  • Esri

    Software Reliability Engineer II

    Found in: Talent IN C2 - 5 days ago


    Esri New Delhi, India

    Overview · Esri is the world leader in geographic information systems (GIS) and developer of ArcGIS, the leading mapping and analytics software used in 75 percent of Fortune companies. At Esri, we believe in helping our customers take on challenging geospatial problems and makin ...

  • ARR Recruitment Solutions

    Senior Site Reliability Engineer

    Found in: Talent IN 2A C2 - 2 days ago


    ARR Recruitment Solutions Any Location/Bangalore, India permanent

    Educational Qualification : & Experience : · Experience Level (Years): 6 - 8 · Primary Skill : CI/CD · Relevant Years of Experience for Primary Skills : 4+ · Secondary Skill : Python · Relevant Years of Experience for Secondary Skills : 3+ · Job Description : · - 5+ years of expe ...

  • TravelKhana

    Architects/Engineers - Mobility

    Found in: beBee S2 IN - 3 days ago


    TravelKhana Delhi, India Full time

    Apply for Architects/Engineers Mobility, Career Progress Consultants in Delhi, Delhi/NCR for Year of Experience on ...

  • FINCENT SOFTWARE SERVICES PRIVATE LIMITED

    Fincent - Site Reliability Engineer - Cloud Services

    Found in: Talent IN 2A C2 - 4 days ago


    FINCENT SOFTWARE SERVICES PRIVATE LIMITED Any Location, India permanent

    Responsibilities : · Help to eliminate operational toil - seek to automate repetitive operations work. · Work with product development teams to ensure that our new features are able to meet SLAs. · Help mature the delivery process for teams; defining/managing automated deployment ...

  • Daxko

    Site Reliability Engineer

    Found in: Talent IN 2A C2 - 2 days ago


    Daxko Noida, India

    Company Description · Daxko powers health & wellness throughout the world. Every day our team members focus their passion and expertise in helping health & wellness facilities operate efficiently and engage their members. · Whether a neighborhood yoga studio, a national franchise ...

  • Idemia

    Site Reliability Engineer

    Found in: Talent IN C2 - 2 days ago


    Idemia Noida, India

    You may not know our name, but you have surely used our innovations and solutions. · Our mission is to unlock the world and make it safer through cutting-edge identity technologies. Every day, around the globe, we are enabling citizens and consumers alike to perform their daily ...

  • TravelKhana

    CTO/Architects/Engineers - Server Side

    Found in: beBee S2 IN - 3 days ago


    TravelKhana Delhi, India Full time

    Apply for CTO/Architects/Engineers Server Side, Career Progress Consultants in Delhi, Delhi/NCR for Year of Experience on ...

  • Coforge

    Lead Site Reliability Engineer

    Found in: Talent IN 2A C2 - 4 days ago


    Coforge Noida, India

    Description: · thought leader in the SRE space to help design a strategy and roadmap to help us mature as an organization · and translate business requirements to technical requirements, solution designing with commercial viability, and build business cases. · the sales team on s ...

  • Microsoft

    Site Reliability Engineering IC2

    Found in: Talent IN C2 - 4 days ago


    Microsoft Noida, India Full time

    Overview · Site Reliability Engineering - 1 · Job Summary · Do you want to work on a product that is used by millions of people around the world daily, and growing rapidly? Do you care deeply about how software is designed with a focus on supporting global scale? Do you want to ...

  • Global Payments

    Senior Site Reliability Engineer

    Found in: Talent IN C2 - 1 day ago


    Global Payments Noida, India Full time

    Every day, Global Payments makes it possible for millions of people to move money between buyers and sellers using our payments solutions for credit, debit, prepaid and merchant services. Our worldwide team helps over 3 million companies, more than 1,300 financial institutions an ...

  • TSYS

    Senior Site Reliability Engineer

    Found in: Talent IN C2 - 1 day ago


    TSYS Noida, India Full time

    Every day, Global Payments makes it possible for millions of people to move money between buyers and sellers using our payments solutions for credit, debit, prepaid and merchant services. Our worldwide team helps over 3 million companies, more than 1,300 financial institutions an ...

  • Oracle

    Principal Site Reliability Engineer

    Found in: Talent IN C2 - 22 hours ago


    Oracle Noida, India Regular Employee

    Job description · We are looking for dynamic and forward-looking engineers to join our database cloud engineering team. Candidate must have Oracle Database Administration experience as a Site Reliability Engineer or DBA on large production environments. Understand the end-to-end ...

  • Microsoft

    Site Reliability Engineering IC3

    Found in: Talent IN C2 - 3 days ago


    Microsoft Noida, India Full time

    Overview · Site Reliability Engineer 2- WEST · Job Summary · Do you want to work on a product that is used by millions of people around the world daily, and growing rapidly? Do you care deeply about how software is designed with a focus on supporting global scale? Do you want to ...

  • TSYS

    Senior Site Reliability Engineer

    Found in: Talent IN C2 - 5 days ago


    TSYS Noida, India Full time

    Every day, Global Payments makes it possible for millions of people to move money between buyers and sellers using our payments solutions for credit, debit, prepaid and merchant services. Our worldwide team helps over 3 million companies, more than 1,300 financial institutions an ...

  • Microsoft

    Senior Site Reliability Engineer

    Found in: Talent IN C2 - 2 days ago


    Microsoft Noida, India Full time

    Overview · Are you passionate about building and maintaining the world's computer? Do you want to work on the cutting-edge of cloud technology and solve challenging problems at hyperscale? If so, join us as a Site Reliability Engineer (SRE) in the Microsoft Azure Networking team ...

  • Microsoft

    Site Reliability Engineer II

    Found in: Talent IN C2 - 2 days ago


    Microsoft Noida, India Full time

    Overview · Are you looking to make a real difference in Microsoft's mission to empower every person and organization to achieve more, with the power of cloud computing? Are you passionate about driving reliability of the services to make customers' mission critical workloads run ...

  • Adobe

    Database Reliability Engineer 3

    Found in: Talent IN C2 - 2 days ago


    Adobe Noida, India Full time

    Our Company · Changing the world through digital experiences is what Adobe's all about. We give everyone—from emerging artists to global brands—everything they need to design and deliver exceptional digital experiences We're passionate about empowering people to create beautiful ...

  • Airtel Digital

    Site Reliability Engineer

    Found in: Talent IN 2A C2 - 2 days ago


    Airtel Digital Gurugram, India

    Site Reliability Engineer is one of the critical roles in the technology team, and the person in this role will be responsible for application performance, availability, reliability and system uptime. The candidate is responsible for providing consultation and strategic recommenda ...