
Durgesh Singh
Technology / Internet
About Durgesh Singh:
Cloud & DevOps Engineer with expertise in AWS, OpenStack, OpenShift, Linux, Kubernetes, and CI/CD automation. Adept at designing and managing cloud infrastructure, monitoring solutions, and containerized deployments. Passionate about DevOps automation, cloud security, and site reliability engineering (SRE).
Experience
OpenStack Infrastructure Management
Managed OpenStack infrastructure across compute, storage, and networking services, ensuring 24/7 availability.
Integrated monitoring tools like Grafana, Kibana, and Prometheus for 70+ Pan-India sites.
Expanded compute, KVM, and Ceph nodes on live sites, following proper CR and incident timelines.
Configured and monitored virtual machine (VM) health and performance using Prometheus and custom exporters.
Performed OpenStack upgrades and migrations with minimal downtime, ensuring seamless service continuity.
Automated routine tasks and system upgrades to enhance operational efficiency.
Collaborated with cross-functional teams to troubleshoot performance bottlenecks in Nova, Neutron, and Cinder services.
Validated MBSS documents for the H-Cloud Project while implementing security hardening concepts.
Managed and configured backups for all sites using the Commvault backup tool.
Monitoring and Observability
Deployed and maintained monitoring tools (Prometheus, Thanos, Grafana) for real-time observability and long-term metrics storage.
Integrated Thanos for scalable and centralized metric queries across distributed Prometheus instances.
Used the Prometheus query language (PromQL) and Kibana to support AI/ML use cases.
Performed all testing and alert configuration for site expansions and new node validations.
Created custom dashboards and set up alerts for critical infrastructure metrics.
Site Expansion Projects
Led site expansion projects by provisioning compute and storage resources using Heat templates and manual orchestration.
Validated and tested newly expanded nodes to ensure optimal performance and reliability.
Incident and SLA Management
Created and resolved TTs (Trouble Tickets) and CRs (Change Requests) for all issues and activities, ensuring compliance with SLA.
Monitored and ensured the timely delivery of health checkup reports using Python and Linux scripting.
OpenShift and Container Management
Managed multiple projects within OpenShift, handling resource allocation and scaling for services and pods.
Created, updated, and deleted ConfigMaps and Secrets to manage dynamic application configurations.
Integrated Prometheus and Grafana for real-time monitoring and improved cluster observability.
Troubleshot pod failures and network issues, achieving 99.9% application uptime.
Monitored pod health and resolved performance issues using OpenShift Web Console and CLI tools.
Contributed to automation scripts for container builds, deployment rollbacks, and log collection.
Data Analytics and Reporting
Integrated multiple network components in the NLM project, applying data analytics and Excel to data from Logstash and Elasticsearch.
Utilized advanced data analysis techniques to ensure accurate insights and system performance.
Education