
Durgesh Singh
Technology / Internet
About Durgesh Singh:
Cloud & DevOps Engineer with expertise in AWS, OpenStack, OpenShift, Linux, Kubernetes, and CI/CD automation. Adept at designing and managing cloud infrastructure, monitoring solutions, and containerized deployments. Passionate about DevOps automation, cloud security, and site reliability engineering (SRE).
Experience
OpenStack Infrastructure Management
Managed OpenStack infrastructure across compute, storage, and networking services, ensuring 24/7 availability.
Integrated monitoring tools like Grafana, Kibana, and Prometheus for 70+ Pan-India sites.
Expanded compute, KVM, and Ceph nodes on live sites, following proper CR and incident timelines.
Configured and monitored virtual machine (VM) health and performance using Prometheus and custom exporters.
Performed OpenStack upgrades and migrations with minimal downtime, ensuring seamless service continuity.
Automated routine tasks and system upgrades to enhance operational efficiency.
Collaborated with cross-functional teams to troubleshoot performance bottlenecks in Nova, Neutron, and Cinder services.
Validated MBSS documents for the H-Cloud Project while implementing security hardening concepts.
Managed and configured backups for all sites using the Commvault backup tool.
Monitoring and Observability
Deployed and maintained monitoring tools (Prometheus, Thanos, Grafana) for real-time observability and long-term metrics storage.
Integrated Thanos for scalable and centralized metric queries across distributed Prometheus instances.
Used the Prometheus query language (PromQL) and Kibana to support AI/ML use cases.
Performed all testing and alert configuration for site expansions and new node validations.
Created custom dashboards and set up alerts for critical infrastructure metrics.
Site Expansion Projects
Led site expansion projects by provisioning compute and storage resources using Heat templates and manual orchestration.
Validated and tested newly expanded nodes to ensure optimal performance and reliability.
Incident and SLA Management
Created and resolved TTs (Trouble Tickets) and CRs (Change Requests) for all issues and activities, ensuring compliance with SLA.
Monitored and ensured the timely delivery of health checkup reports using Python and Linux scripting.
OpenShift and Container Management
Managed multiple projects within OpenShift, handling resource allocation and scaling for services and pods.
Created, updated, and deleted ConfigMaps and Secrets to manage dynamic application configurations.
Integrated Prometheus and Grafana for real-time monitoring and improved cluster observability.
Troubleshot pod failures and network issues, achieving 99.9% application uptime.
Monitored pod health and resolved performance issues using OpenShift Web Console and CLI tools.
Contributed to automation scripts for container builds, deployment rollbacks, and log collection.
Data Analytics and Reporting
Integrated multiple network components in the NLM project, applying data analytics and Excel skills to Logstash and Elasticsearch data.
Utilized advanced data analysis techniques to ensure accurate insights and system performance.