DevOps Engineer - Greater Bengaluru Area, India - Groww

    2 weeks ago

    Accounting / Finance
    Description

    About Groww:

    We are a passionate group of people focused on making financial services accessible to every Indian through a multi-product platform. Each day, we help millions of customers take charge of their financial journey. Customer obsession is in our DNA. Every product, every design, every algorithm down to the tiniest detail is executed keeping the customers' needs and convenience in mind. Our people are our greatest strength. Everyone at Groww is driven by ownership, customer-centricity, integrity and the passion to constantly challenge the status quo.

    Are you as passionate about defying conventions and creating something extraordinary as we are? Let's chat.

    Company Vision:

    Every individual deserves the knowledge, tools, and confidence to make informed financial decisions. At Groww, we are making sure every Indian feels empowered to do so through a cutting-edge multi-product platform offering a variety of financial services. Our long-term vision is to become the trusted financial partner for millions of Indians.

    Our Values:

    Our culture enables us to be what we are: India's fastest-growing financial services company. It fosters an environment where collaboration, transparency, and open communication take center stage and hierarchies fade away. There is space for every individual to be themselves and feel motivated to bring their best to the table, as well as craft a promising career for themselves. The values that form our foundation are:

    • Radical customer centricity
    • Ownership-driven culture
    • Keeping everything simple
    • Long-term thinking
    • Complete transparency

    EXPERTISE AND QUALIFICATIONS

    What you'll do

    • Bridging the gaps between the Core Infra and development teams.
    • Owning the end-to-end availability, performance, and capacity of applications and their infrastructure, and creating and maintaining the respective observability with Grafana LGTM.
    • Providing 24x7 infra and app support, building processes for it, and documenting the related "tribal" knowledge.
    • Mentoring and training L1 engineers and continually improving app and infra support processes.
    • Analyzing the data in Mimir, Loki, and Tempo and writing appropriate PromQL (or a similar query language) queries to present it.
    • Providing analytics on data, alerts, application health, etc.
    • Creating meaningful dashboards that visualize the data according to each stakeholder's needs.
    • Managing SLOs, error budgets, and alerts, and performing root cause analysis for production errors.
    • Working with the Core Infra, Dev, and Product teams to define SLOs, error budgets, and alerts.
    • Working with the Dev team to have an in-depth understanding of the application architecture and its bottlenecks.
    • Identifying observability gaps in application & infrastructure and working with stakeholders to fix them.
    • Managing outages, carrying out detailed RCA with developers, and identifying ways to prevent a recurrence.

    What We're Looking For:

    • 3 to 6 years of experience in managing high-traffic, large-scale microservices and infrastructure, with excellent troubleshooting skills.
    • Experience in troubleshooting, managing, and deploying containerized environments using Docker/containerd and Kubernetes is a must.
    • Must be proficient with Helm.
    • Must be very hands-on in managing and troubleshooting the Kubernetes environment.
    • Must be proficient with PromQL, NRQL, LogQL, and TraceQL.
    • Must have a good understanding of the Grafana LGTM stack.
    • Extensive experience with Linux administration and a good understanding of the various Linux kernel subsystems (memory, storage, network, etc.).
    • Extensive experience in DNS, TCP/IP, UDP, gRPC, routing, and load balancing.
    • Expertise in Google Cloud (GCP) and/or other relevant Cloud Infrastructure solutions like AWS or Azure.
    • Experience with multiple datastores is a plus (Kafka/RabbitMQ, Redis, Elasticsearch).
    • Must be proficient in Google SRE practices.
    • Must be good at one of the DevOps scripting languages: Python or Go.
    • A collaborative spirit with the ability to work across disciplines to influence, learn and deliver.
    • A deep understanding of computer science, software development, and networking principles.