
    Database Reliability Engineer - Hyderabad, India - Splunk Inc

    Splunk Inc
    Description
    Join us as we pursue our ground-breaking vision to make machine data accessible, usable, and valuable to everyone. We are a company filled with people who are passionate about our product and seek to deliver the best experience for our customers. At Splunk, we are committed to our work, customers, having fun, and most significantly to each other's success.

    The provides full-fidelity monitoring and troubleshooting across infrastructure, applications, and user interfaces, in real-time and at any scale, to help our customers keep their services reliable, innovate faster, and deliver great customer experiences. Infrastructure Software Engineers at Splunk are cloud-native systems engineers who use infrastructure-as-code, microservices, automation, and efficient design to build, operate, and scale our products.

    Role

    You will help us run one of the largest and most sophisticated cloud-scale, big data, and microservices platforms in the world. You will be responsible for enabling developers to operate highly available, scalable, and cost-efficient applications with low operational burden by managing and improving the reliability and resiliency of SRE-managed services and infrastructure. You thrive on automation, infrastructure-as-code, reliability engineering, and getting rid of tedious, manual tasks. You will:
  • Design new services, tools, and monitoring to be implemented by the entire team.
  • Analyze the tradeoffs of the proposed design and make recommendations based on these tradeoffs.
  • Mentor new engineers to achieve more than they thought possible. You enjoy making other teams successful and are fulfilled through the success of others.
  • Work on database reliability projects, including:
  • HA, Business Continuity Planning, disaster recovery, backup/restore, RTO, RPO
  • Database uptime and performance
  • Capacity management & planning
  • SLIs, SLOs, error budgets, and monitoring dashboards
  • Responsible for deployment and operations of large-scale distributed data stores and streaming services
  • Establishing design patterns for monitoring and benchmarking
  • Establishing and documenting production run books and guidelines for developers
  • Tooling, toil reduction, runbooks & automation to manage production environments
  • Incident management and improving MTTD/MTTR for services
  • Cloud cost optimization
    Qualifications

    Mandatory
  • 3+ years of experience with deployment, operations, and performance management of large-scale Cassandra clusters along with Zookeeper.
  • 2+ years of experience in managing large-scale cloud-native microservices platforms.
  • Experience with infrastructure automation and scripting using Python and/or bash scripting.
  • Excellent problem-solving, troubleshooting, and debugging skills in large-scale distributed systems
    Preferred
  • Confluent Certified Administrator for Apache Kafka and/or Apache Cassandra Administrator Associate certifications are preferred
  • AWS Solutions Architect certification preferred.
  • Strong hands-on experience deploying, managing, and monitoring large-scale Kubernetes clusters in the public cloud, specifically AWS or GCP.
  • Experience with deployment, operations, and performance management of one or more large-scale clusters such as Cassandra, Kafka, Elastic/Open Search, MongoDB, ZooKeeper, Redis, etc.
  • Experience with Infrastructure-as-Code using Terraform, CloudFormation, Google Deployment Manager, Pulumi, Packer, ARM, etc.
  • Proven skills to effectively work across teams and functions to influence the design, operations, and deployment of highly available software.
  • Bachelor's/Master's in Computer Science, Engineering, or related technical field, or equivalent practical experience.


  • Wall Street Consulting Services LLC Hyderabad, India

    Role: SRE · Exp: 6+ years · Location: Pune, Bengaluru, Chennai · JOB DESCRIPTION: · The Role: · As a Site Reliability Engineer, you will be critical in ensuring our software products' reliability, scalability, and performance. You will be responsible for designing and implementin ...


  • Quiktrak, LLC Hyderabad, India

    Job Title: Azure Site Reliability Engineer (SRE) / DevOps Engineer · Job Description: · Summary: · As an Azure Site Reliability Engineer (SRE) / DevOps Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud infrastructure on the ...


  • ValueLabs Hyderabad, India

    Experienced in SRE or Site Reliability Engineer · Design, implement, and maintain automated processes for deploying, monitoring, and managing applications on Azure DevOps. · Collaborate with cross-functional teams to optimize system performance, reliability, and scalability. · D ...


  • PURVIEW Hyderabad, India

    Role: Site Reliability Engineer · Location: Hyderabad · Job Type: Contract (Permanent to Purview Services) · NOTE: Client is looking for immediate joiners or 1 Month Notice Period candidates only. · Job Description: · Primary job responsibilities · Ability to operate and maintai ...


  • SID Global Solutions Hyderabad, India

    Job Title: Site Reliability Engineer · Location: Hyderabad - Onsite · Work Mode: 5 Days Working from Office · JOB DESCRIPTION · • Experience in Cloud administration and troubleshooting(GCP is recommended) or AWS or AZURE · • Experience with Kubernetes or comparable technolog ...


  • DATAMTX LLC Hyderabad, India

    The Company · Datamtx (formerly Datamatics), established in 1993 and globally HQ'd in Atlanta, has a stellar history supporting both Tier 1 and 2 ERP rollouts ranging from implementations, data cleanse, migrations, customization, hypercare and Day 1 support. We are also nationall ...


  • Wall Street Consulting Services LLC Hyderabad, India

    Role: SRE · Exp: 6+ years · Location: Pune, Bengaluru, Chennai · JOB DESCRIPTION: · The Role: · As a Site Reliability Engineer, you will be critical in ensuring our software products' reliability, scalability, and performance. You will be responsible for designing and implementing highly ava ...


  • Banyan Cloud Hyderabad, India

    About US · Honest Data technologies Pvt Ltd, is a wholly owned subsidiary of Banyan Cloud, USA, the Cyber Security Product Company, headquartered in San Jose, California, USA, owning the SaaS product "Banyan Cloud", first of its kind Cyber Security CNAP Platform that simplifies t ...


  • Zyoin group Hyderabad, India

    We are seeking a highly skilled and experienced Database Reliability Engineer (DBRE) to join our team and play a crucial role in ensuring the performance, scalability, and high availability of our customer database services on the Tessell Platform. · Minimum Requirements : · year ...


  • Coforge Hyderabad, India

    Role: Site Reliability Engineer · Location: Hyderabad · Work Mode: WFO · Experience: 6-10 yrs · Job Description: · Deep knowledge of version control. · Sound knowledge of operating Systems (like LINUX). · Should be aware of DevOps concepts and best practices. · CI/CD implementati ...


  • WaferWire Cloud Technologies Hyderabad, India

    Hi, this is Sundeep from Waferwire Technologies, and we are hiring a Site Reliability Engineer (SRE). · Role: Site Reliability Engineer (SRE) · Location: Hyderabad Office · Experience: 5 to 8 Years · Role Description: · Responsibilities: · Implement and maintain robust DevOps practices wit ...


  • Shining Sheroes Bangalore/Hyderabad, India permanent

    Primary job responsibilities : · - Ability to operate and maintain various OS platforms, with a focus on debugging, automation, availability, performance, and scale. · - Diagnose and troubleshoot complex distributed systems. · - Work and collaborate across teams, such as OS, Appl ...


  • PURVIEW Hyderabad, India

    Role: Site Reliability Engineer · Location: Hyderabad · Job Type: Contract (Permanent to Purview Services) · NOTE: Client is looking for immediate joiners or 1 Month Notice Period candidates only. · Job Description: · Primary job responsibilities · Ability to operate and maintain various ...


  • Kiash Solutions LLP Hyderabad, India Full time

    Considering candidates with 7+ yrs of exp. · MUST be able to onboard in 0-15 days MAX 20 days. · Location - Hyderabad (work from Office) · Timings - IST · Budget - open to discussion · Please find the JD for reference. · Design and develop tools that will aid in improving reli ...


  • Virtusa Hyderabad, India

    Site Reliability engineer - CREQ188641 Description Position : SRE · Primary skills: devops CI/CD pipeline · Location: Hyderabad · Should have proficiency in understanding of application monitoring stack(Logs, Events, Metrics and Alerts) and ability to visualize and setup end-to- ...


  • Electronic Arts Hyderabad, India

    Pogo has been the leader in online casual games since 1998. Featuring a growing library of 60+ titles spanning popular genres like Solitaire, Mahjong, Match 3, and more, Pogo exists to be the best destination for online casual games. We strive to produce high-quality HTML5-po ...


  • Oriontek INC Hyderabad, India Full time

    The Role · You will be responsible for : · Gathering and evaluating user feedback. · Providing code documentation and other inputs to technical documents. · Supporting continuous improvement by investigating alternatives and new technologies and presenting these for architectur ...


  • Alter Domus Hyderabad, India

    ABOUT US · We are Alter Domus. Meaning "The Other House" in Latin, Alter Domus is proud to be home to 85% of the top 30 asset managers in the alternatives industry, and more than 5,000 professionals across 23 countries. · With a deep understanding of what it takes to succeed in ...


  • Ampcus Tech Pvt. Ltd Hyderabad, India Full time

    Job Title: Site Reliability Engineer (SRE) · Work Location: Hyderabad (work from office) · Type: Fulltime Permanent · Notice Period: 30 Days · Experience: 5+Years · JD: · Responsibilities: · Build and maintain our cloud platforms and support applications, demonstrating agile and ...


  • Wipro Hyderabad, India

    Requirement- SRE · Experience: 6+Years · Location : Pan India · Key responsibilities · Review Monitoring & alerts to provide recommendations for enhancement towards 360° coverage · Create dashboards, setup synthetic and real user monitoring, visualize large data sets with inter ...