SRE - Cloud Native - Pune, India - Ellicium

Ellicium
Verified Company
Pune, India

1 month ago

Posted by: Deepika Kaur, beBee Recruiter


Description
What are we about? At Ellicium, be ready for contagious excitement, worthy challenges, and an enriching learning experience every day. We trust in the process of failing fast and learning fast.

You will find 'Ellicians' putting their hearts into getting the perfect dish during our monthly potlucks, or arguing over the best Nolan movie during lunch breaks.

It is all about having fun. We are passionate people with immense love for what we do, and we are proud of what we have created.

Our Key Values

  • AMBITION: Empowering individuals to reach new heights and achieve their ambitious goals.
  • TEAMWORK: Collaborating to achieve extraordinary results and support each other's success.
  • GROWTH: Providing continuous learning and development opportunities, unlocking your full potential.
  • COMMUNITY: Building a supportive and inclusive community where we make a positive impact together.

Perks and Benefits

Businesses need to make better and faster decisions by analyzing data to stay competitive and future-ready.

  • Targeted Bonus Program
  • Health Care
  • Competitive Salary


SRE - Cloud Native:


Primary skills:
Containers, Kubernetes, DevOps, Python, Golang, TDD, Linux


Years of experience: 5-7+


The Service Operations team at "Product Platform" Systems is responsible for building and operating the platform and infrastructure that enables us to deliver our groundbreaking capabilities to enterprise customers.

As a site reliability engineer on this team, you will work closely alongside the platform engineering team to deploy and manage our Kubernetes-based platform at a global scale.

You will lead multiple initiatives to enhance our capabilities and provide a reliable, scalable service for customers, in a hybrid deployment pattern.


How you will make an impact:

  • Assume broad responsibilities for successful delivery of our "Product Platform" services in a hybrid model, including, but not limited to, deployment, configuration, integrations, and ongoing operations
  • Deploy, administer, and manage multiple Kubernetes clusters, both on-prem and in private cloud environments
  • Develop and continuously improve platform capabilities for observability, monitoring, notifications, logging, tracing, and continuous delivery with reduced toil
  • Develop standard solutions that enable consistency in service delivery, and proactively engage with multiple cross-functional teams to solve problems that impact service levels
  • Determine and set SLOs for the service, and build the processes and tools to measure and implement the SLOs and to prevent recurring problems and undesirable service conditions
  • Participate in on-call rotation responsibilities

Basic Qualifications:


  • Bachelor's and/or Master's in CS/EE or a related field
  • 5+ years of hands-on experience as an SRE with a focus on cloud-native technologies
  • Hands-on experience deploying, managing, and troubleshooting Kubernetes clusters and components
  • Strong experience configuring and administering Linux systems in cloud/SaaS production environments
  • Systematic problem-solving approach to troubleshooting, and the desire to solve the root cause of common problems in 24×7 environments

Preferred Qualifications:

  • Software programming experience in one or more languages, including Go/Python
  • Experience delivering infrastructure as code: Ansible, Terraform, Git, Jenkins, Helm, ArgoCD
  • Good understanding of DNS, DHCP, LDAP, NFS, Kerberos, PAM, PXE, SNMP, SSH, HTTP/S, and NTP, and of troubleshooting network performance issues
  • Experience with monitoring and logging systems such as Prometheus, Grafana, Nagios, ELK, etc., and the ability to identify new technologies as appropriate
  • Experience tuning and optimizing storage solutions, including Object Storage and NFS
  • Knowledge of virtualization and multiple hypervisor technologies, as well as cloud computing platforms such as AWS, Azure, and GCP
  • Configuration and maintenance of web servers, load balancers, databases, storage systems, and messaging systems
  • Good understanding of test-driven development, continuous integration, and delivery
  • A passion to design for high availability and scale, with the discipline and desire for extensive automation
  • Strong communication skills, with the ability and willingness to work with diverse teams and customers across multiple time zones
