- Ensure high availability, performance, and reliability of CPaaS production systems spread across multiple locations, hosted in the cloud and in data centers.
- Own and improve SLIs, SLOs, and SLAs for messaging platforms and supporting services.
- Monitor system health, latency, TPS, error rates, and delivery metrics using observability tools.
- Participate in on-call rotations and handle production incidents with a focus on fast recovery and root cause analysis.
- Deploy, configure, and optimize systems for high-throughput messaging across multiple channels.
- Troubleshoot telecom-specific issues including DLR failures, encoding problems, TPS drops and routing issues.
- Work directly with multiple teams for integrations, testing, and incident resolution.
- Perform packet-level analysis using tcpdump and Wireshark to diagnose network and protocol-level issues.
- Write and maintain shell scripts and automation to eliminate repetitive operational tasks and reduce human intervention.
- Contribute to infrastructure automation using tools like Ansible and CI/CD pipelines where applicable.
- Improve deployment, configuration, and rollback processes for messaging services.
- Design and enhance monitoring, alerting, and dashboards using tools such as Datadog, Site24x7, ELK and Grafana.
- Administer and troubleshoot Linux-based servers in production environments.
- Manage and optimize MySQL and MongoDB databases including performance tuning, backups, and recovery.
- Work on APIs and webhooks across products and services, including enhancements and troubleshooting.
- Maintain web and application servers such as Apache, Nginx, and JBoss (WildFly).
- Support cloud-based and virtualized environments with exposure to auto-scaling and containerization concepts.
- Collaborate with engineering teams on release planning, production deployments, and post-release validation.
- Lead or contribute to incident response and RCA, focusing on long-term reliability improvements.
- Track issues, changes, and reliability work using Jira and related tools.
- B.Tech / B.E in Computer Science or related field with 2–3 years of experience in SRE, DevOps, telecom, or CPaaS operations.
- Hands-on experience with SMS gateways and messaging workflows.
- Solid understanding of Linux systems, networking fundamentals, and production troubleshooting.
- Strong experience with MySQL & MongoDB administration, queries, and performance optimization.
- Proficiency in shell scripting and a mindset toward automation and reliability engineering.
- Hands-on experience with tcpdump, Wireshark, and protocol-level troubleshooting.
- Experience with monitoring, logging, and alerting systems (Datadog, ELK, Grafana, Site24x7, etc.).
- Familiarity with configuration management tools like Ansible and version control systems (Git).
- Working knowledge of cloud platforms, virtualization, auto-scaling, and containerization.
- Strong incident management, analytical thinking, and communication skills.
- Certifications such as RHCE, AWS, or SRE-related credentials are a plus.
Site Reliability Engineer - Gurugram - ValueFirst
Description
About the Job
The Site Reliability Engineering (SRE) team is responsible for ensuring the reliability, scalability, and performance of large-scale telecom and CPaaS platforms. This role combines software engineering and systems operations to build resilient, observable, and automated infrastructure that supports high-throughput messaging services. The team operates in a 24/7 environment and works closely with Engineering, CX and Products to maintain carrier-grade service reliability.
What you'll be responsible for
What you'd have