SRE + Production Support, 6 to 9 Years, Chennai - Capgemini
Description
Job Description - Reliability Engineer
- Identify and manage asset reliability risks that could adversely affect plant or business operations.
- Create frameworks, models, and approaches for better governance and for driving reliability improvement initiatives.
- Devise emergency response methods, either by being on call or by reacting to symptoms surfaced by monitoring, and escalate when needed.
- Propose and drive architectural changes that affect the whole platform to solve scaling and performance problems.
- Analyze existing Service Level Objectives (SLOs), create and maintain new ones, and scale systems through automation to improve change velocity and build reliability.
- Troubleshoot, evaluate, and resolve operational challenges, contributing to the defined SLOs.
- Work in close collaboration with software development teams to shape the future roadmap and establish strong operational readiness across teams.
- Deep understanding of DevOps and SRE, including the key metrics for managing SRE (MTTD, MTTR, MTBF).
- Primary Skills
  - Monitoring: New Relic, Splunk, AppInsights, Log Analytics, Grafana, Prometheus, Dynatrace
  - CI/CD: Jenkins, Azure DevOps
  - Programming: Java, .NET
  - SLA/SLI/SLO
- Secondary Skills
  - A proactive approach to spotting problems, areas for improvement, and performance bottlenecks.
  - Expertise in at least one automation framework or scripting tool (UiPath, Bash, or PowerShell).
  - Kubernetes and containerization of systems (EKS, AKS, or K8s).