Site Reliability Engineer - Hyderabad / Secunderabad, Telangana - confidential

    confidential Hyderabad / Secunderabad, Telangana

    5 days ago

    Full time
    Description

    Key Responsibilities


    • Observability Platform Implementation:
        o Design and maintain distributed tracing, metrics, and logging using OpenTelemetry, Prometheus, Loki, and Tempo.
        o Ensure complete instrumentation of .NET Core applications for end-to-end visibility.
        o Implement telemetry pipelines for application logs, performance metrics, and traces.
    • Monitoring & Alerting:
        o Develop and manage SLIs, SLOs, and error budgets.
        o Create actionable, noise-free alerts using Prometheus Alertmanager and Azure Monitor.
        o Monitor key infrastructure components, applications, and databases with a focus on reliability and performance.
    • Azure & Infrastructure Integration:
        o Integrate Azure services (App Services, VMs, Storage, etc.) with the observability stack.
        o Configure monitoring for MSSQL databases, including performance tuning metrics and health indicators.
        o Use Azure Monitor, Log Analytics, and custom exporters where necessary.
    • Automation & DevOps:
        o Automate observability configurations using Terraform, PowerShell, or other IaC tools.
        o Integrate telemetry validation and health checks into CI/CD pipelines.
        o Maintain observability as code for repeatable deployments and easy scaling.
    • Resilience & Reliability Engineering:
        o Conduct capacity planning to anticipate scaling needs based on usage patterns and growth.
        o Define and implement disaster recovery strategies for critical Azure-hosted services and databases.
        o Perform load and stress testing to identify performance bottlenecks and validate infrastructure limits.
        o Support release engineering by integrating observability checks and rollback strategies in CI/CD pipelines.
        o Apply chaos engineering practices in lower environments to uncover potential reliability risks proactively.
    • Collaboration & Documentation:
        o Partner with engineering teams to promote observability best practices in .NET Core development.
        o Create dashboards (Grafana preferred) and runbooks for system insights and incident response.
        o Document monitoring standards, troubleshooting guides, and onboarding materials.

    Required Skills and Experience

    • 4+ years of experience in SRE, DevOps, or infrastructure-focused roles.
    • Deep experience with .NET Core application observability using OpenTelemetry.
    • Proficiency with Prometheus, Loki, Tempo, and related observability tools.
    • Strong background in Azure infrastructure monitoring, including App Services and VMs.
    • Hands-on experience monitoring MSSQL databases (deadlocks, query performance, etc.).
    • Familiarity with Infrastructure as Code (Terraform, Bicep) and scripting (PowerShell, Bash).
    • Experience building and tuning alerts, dashboards, and metrics for production systems.

    Preferred Qualifications

    • Azure certifications (e.g., AZ-104, AZ-400).
    • Experience with Grafana, Azure Monitor, and Log Analytics integration.
    • Familiarity with distributed systems and microservice architectures.
    • Prior experience in high-availability, regulated, or customer-facing environments.
