
Durgesh Singh
Technology / Internet
About Durgesh Singh:
Cloud & DevOps Engineer with expertise in AWS, OpenStack, OpenShift, Linux, Kubernetes, and CI/CD automation. Adept at designing and managing cloud infrastructure, monitoring solutions, and containerized deployments. Passionate about DevOps automation, cloud security, and site reliability engineering (SRE).
Experience
OpenStack Infrastructure Management
Managed OpenStack infrastructure across compute, storage, and networking services,
ensuring 24/7 availability.
Integrated monitoring tools like Grafana, Kibana, and Prometheus for 70+ Pan-India
sites.
Expanded compute, KVM, and Ceph nodes on live sites, following proper CR and
incident timelines.
Configured and monitored virtual machine (VM) health and performance using
Prometheus and custom exporters.
Performed OpenStack upgrades and migrations with minimal downtime, ensuring
seamless service continuity.
Automated routine tasks and system upgrades to enhance operational efficiency.
Collaborated with cross-functional teams to troubleshoot performance bottlenecks in
Nova, Neutron, and Cinder services.
Validated MBSS documents for the H-Cloud Project while implementing security
hardening concepts.
Managed and configured backups for all sites using the Commvault backup tool.
Monitoring and Observability
Deployed and maintained monitoring tools (Prometheus, Thanos, Grafana) for real-time
observability and long-term metrics storage.
Integrated Thanos for scalable and centralized metric queries across distributed
Prometheus instances.
Utilized the Prometheus query language (PromQL) and Kibana to support AI/ML use cases.
Performed all testing and alert configuration for site expansions and new node
validations.
Created custom dashboards and set up alerts for critical infrastructure metrics.
Site Expansion Projects
Led site expansion projects by provisioning compute and storage resources using Heat
templates and manual orchestration.
Validated and tested newly expanded nodes to ensure optimal performance and
reliability.
Incident and SLA Management
Created and resolved TTs (Trouble Tickets) and CRs (Change Requests) for all issues and
activities, ensuring compliance with SLAs.
Ensured the timely delivery of health check reports using Python and
Linux shell scripting.
OpenShift and Container Management
Managed multiple projects within OpenShift, handling resource allocation and scaling
for services and pods.
Created, updated, and deleted ConfigMaps and Secrets to manage dynamic application
configurations.
Integrated Prometheus and Grafana for real-time monitoring and improved cluster
observability.
Troubleshot pod failures and network issues, achieving 99.9% application uptime.
Monitored pod health and resolved performance issues using OpenShift Web Console
and CLI tools.
Contributed to automation scripts for container builds, deployment rollbacks, and log
collection.
Data Analytics and Reporting
Integrated multiple network components in the NLM project, applying data analytics and
Excel skills to data from Logstash and Elasticsearch.
Utilized advanced data analysis techniques to ensure accurate insights and system
performance.