Database Reliability Engineer - Bangalore, India - Zyoin group

    Zyoin group Bangalore, India

    2 weeks ago

    permanent Technology / Internet

    Job Description :


    We are looking for a Database Reliability Engineer / Sr. Database Reliability Engineer to help us build and enhance our database platforms to achieve availability, scalability, and operational effectiveness. The right individual will embrace the opportunity to tackle challenging problems and use their influence to drive continual improvement. You will also work on cutting-edge technologies like MySQL, Aerospike, Elasticsearch, RocksDB, Couchbase, MongoDB, Kubernetes, Jenkins, Chef, and Prometheus.


    Responsibilities :


    • Managing availability, performance, capacity, and security of database infrastructure such as MySQL, Aerospike, Couchbase, Elasticsearch, Cassandra, ScyllaDB, MongoDB, RocksDB, and PostgreSQL.
    • Building and implementing observability for database health, performance, and capacity.
    • Optimizing on-call rotations and processes.
    • Documenting tribal knowledge.
    • Being conversant with concepts related to RDBMS and NoSQL databases.
    • Being conversant with object-relational mapping (ORM) technology.
    • Helping onboard new databases and clusters through the production readiness review process.
    • Developing tools to manage database infrastructure at scale.
    • Providing reports on service SLOs, error budgets, alerts, and operational overhead.
    • Working with dev and product teams to define SLOs, error budgets, and alerts for databases.
    • Working with the dev team to gain an in-depth understanding of the application data models and advising them on choosing the right database.
    • Identifying observability gaps in database services and infrastructure, and working with stakeholders to fix them.
    • Managing outages, performing detailed RCA with developers, and identifying ways to prevent recurrence.
    • Managing and automating upgrades of the database infrastructure services.
    • Automating toil work.


    Requirements :


    • 4+ years of experience as a database reliability engineer on large-scale databases like MySQL, Cassandra, ScyllaDB, Aerospike, Elasticsearch, RocksDB, Couchbase, and MongoDB.
    • A collaborative spirit with the ability to work across disciplines to influence, learn, and deliver.
    • A deep understanding of computer science, networking, and database concepts.
    • Demonstrated experience with languages such as Python and Bash.
    • Extensive experience with Linux administration and a good understanding of the various Linux kernel subsystems (memory, storage, network, etc.).
    • Extensive experience with DNS, TCP/IP, UDP, gRPC, routing, and load balancing.
    • Expertise in Amazon Web Services (AWS) and/or other relevant cloud infrastructure solutions like Microsoft Azure or Google Cloud.
    • Expertise in GitOps and infrastructure-as-code tools such as Terraform is a plus.
    • Experience in managing and deploying containerized environments using Docker and Mesos/Kubernetes is a plus.