- Strong understanding of modern single-page web applications built with Angular/React, Node.js, etc., and of mobile applications.
- Deep knowledge of monitoring and observability tools (e.g., Dynatrace, Prometheus, Grafana, ELK stack, Datadog, AppDynamics, New Relic, etc.)
- Familiarity with configuration management tools (Ansible, Puppet, etc.) and shell scripting
- Experience with containerization and virtualization tools such as Docker, Kubernetes, and VMs.
- Strong knowledge of SRE principles and how to apply them when implementing monitoring.
- Implement and manage monitoring solutions to track the health and performance of services.
- Proactively monitor application stability.
- Set up alerting and automated responses to minimize downtime.
- Perform root cause analysis and manage incidents for issue resolution.
- Monitor system performance, identify bottlenecks, and collaborate on optimizations.
- Ensure the reliability and availability of our web applications by setting and meeting Service Level Objectives (SLOs).
- Collaborate with development teams to improve the overall reliability of applications and services.
- Develop and maintain automation scripts and tools for repetitive operational tasks.
- Maintain open communication with the Product Owner for product alignment.
- Ensure SRE tasks align with the product's strategic goals.
- Participate in backlog refinement meetings to prioritize SRE-related work items.
- Identify, document, and communicate defects and improvement opportunities.
- Conduct capacity planning to ensure that systems can handle expected loads.
- Analyze data and predict future resource requirements, scaling systems as needed.
- Participate in an on-call rotation to respond to incidents and outages promptly.
- Follow incident management procedures and conduct post-incident reviews.
- Assess risks associated with changes to the production environment.
- Coordinate and execute deployments, ensuring rollback plans are in place.
- Analyze performance bottlenecks and work on optimizing systems for efficiency and cost-effectiveness.
- Maintain comprehensive documentation for systems, processes, and procedures.
- Work closely with cross-functional teams, including development, operations, and security, to achieve common goals.
- Foster a culture of reliability within the organization.
- Execute releases and contribute to the deployment process.
- Provide on-call support.
Site Reliability Engineer - Chennai, India - HNM Solutions
Description
Role :
Site Reliability Engineer
Location :
Chennai, India
Experience : 12 years
Description :
The candidate should have a strong background in software engineering, monitoring, and operations, with a focus on ensuring the reliability, performance, and scalability of our web applications.
Skills :
AWS Cloud :
VPC, subnets, network access control lists, security groups, EC2 instances, S3 buckets, IAM, Route 53, Lambda.
Responsibilities :
Monitoring and Alerting :
Service Reliability :
Automation :
Product Continuous Improvement :
Capacity Planning :
Incident Response :
Change Management :
Performance Analysis :
Documentation :
Collaboration :
Other :