Senior DevOps Engineer - Udaipur, India - GKM IT


    2 weeks ago

    Technology / Internet
    Description

    Company Introduction -

    GKM IT is an outsourcing company specializing in product development and technical consulting.

    We are consultants covering all aspects of product development: design, backend, frontend, mobile, digital, DevOps, and more. We have a presence in Silicon Valley, Europe, Australia, and India. Our major domains are fintech, edtech, healthcare, and hospitality.

    Responsibilities:

    • Deploy updates and fixes
    • Implement and own the CI process
    • Manage CD tooling
    • Implement and maintain monitoring and alerting
    • Build and maintain highly available production systems
    • Build tools to reduce occurrences of errors and improve customer experience
    • Develop software to integrate with internal back-end systems
    • Perform root cause analysis for production errors
    • Investigate and resolve technical issues
    • Develop scripts to automate visualization
    • Design procedures for system troubleshooting and maintenance

    Required Skills:

    • Linux - Strong knowledge of Linux, process management, system administration, network troubleshooting (telnet / netstat / ping / nmap / ngrep / nslookup / tcpdump / traceroute), file system and disk management, DNS, and IP addressing, subnetting, and masking
    • Scripting - Bash (various types of automation and backups), Groovy, and Python scripting
    • Terraform - Must have implemented highly available, scalable, and fault-tolerant multi-tier Azure and AWS environments spanning multiple availability zones using Terraform (including Terraform workspaces for different environments)
    • DR - Strong understanding of RTO and RPO, with disaster recovery planning and implementation to achieve zero-downtime deployment solutions (set up alerting and monitoring for service downtime and take proper action against it)
    • Docker - Excellent knowledge of microservice architecture, Docker, docker-compose, networking, storage, and Dockerization of different application stacks
    • Jenkins - Excellent knowledge of Jenkins and the CI/CD process (including master-slave architecture, scripted and declarative pipelines, tool integrations, RBAC for user management, parameterized and scheduled jobs, shared libraries); GitHub Actions knowledge would be a plus
    • Database - Experience in database management and optimization (creation, administration, debugging, and monitoring): Postgres, MySQL, MongoDB, Redis (both on-prem and cloud services like RDS, DynamoDB, ElastiCache, or Azure managed databases)
    • Ansible - Experience with Ansible and roles for different kinds of automation, configuration management, server setup, patching, etc.
    • Monitoring - Good experience with Prometheus, Grafana, CloudWatch, Graylog, or any equivalent stack or tools
    • Logging - Grafana Loki with Promtail, ELK, or any equivalent stack or tools
    • Web and Proxy servers - Proficient in configuring and optimizing web servers and reverse proxy servers (Nginx, Apache, HAProxy)
    • Security and Compliance - Knowledge of Compliance and security practices for both Azure and AWS
    • Dive deep into the software stack to troubleshoot and resolve issues related to application development, deployment, and operations.
    • Performance tuning, monitoring, and maintenance of fault-tolerant/HA infrastructure to deliver highly scalable services

    Cloud Knowledge

    1) Azure

    Must have experience with Azure services such as:

    Active Directory

    Virtual Network

    Virtual Machines and Scale Sets

    Load Balancer and Application Gateway

    Web Apps / App Service

    Container Apps

    Functions

    Storage Account and Blobs

    Azure Database for PostgreSQL and MySQL

    Computer Vision / Azure OpenAI

    Cognitive Services

    Azure Kubernetes Service - Including setup with Terraform, deployment best practices (Helm charts and ArgoCD), logging and monitoring setup, security, scalability, rolling update and rollback strategies to maintain application availability, and auditing and updating the cluster to the latest versions.

    Good experience with Azure DevOps pipelines for deployment and automation tasks

    2) AWS

    IAM (Users, Groups, Policies, Roles)

    VPC (NAT, Subnet, IGW, Security-groups, NACL, Endpoint-service, Peering)

    EC2, Auto-scaling

    S3, RDS, EFS, ElastiCache, Redshift

    ALB, NLB, Transit-Gateway

    ECS (Fargate and EC2), EKS (including deployment best practices with add-ons)

    CloudFront (CDN), Route53

    Serverless (Lambda, DynamoDB, API Gateway, EventBridge)

    SNS, SES, SQS

    AWS DevOps (CodePipeline, CodeDeploy)

    CloudTrail, CloudWatch

    AWS-CLI (Automation)

    Control Tower, Landing-Zone, AWS-SSO