Compute Operations Engineer - Hyderabad, India - Chubb

    Chubb
    Insurance
    Description

    Compute Operations Engineer

    Position Summary:

    Chubb is seeking an experienced Compute Operations Engineer to join our growing Distributed IT Operations team, providing support for VMware vSphere environments servicing both Virtual Servers (VSI) and Virtual Desktop (VDI) systems. The role requires a minimum of 5 years of directly related experience supporting VMware vSphere environments.

    We are seeking motivated technical professionals who specialize in supporting enterprise-level VMware Cloud Foundation (VCF/vSphere) infrastructure and related services, with a focus on stability and performance. The position holder will primarily perform operational functions for Chubb's Virtual Infrastructure environment; activities include, but are not limited to, the support and maintenance of multiple VCF and vCenter environments, monitoring, alerting, availability, and VM resource and performance management.

    Primary Job Responsibilities:

    Investigation and diagnosis of incidents and problems relating to VMware Cloud Foundation (VCF), vRealize Suite, and vSphere Infrastructure and Virtual Machines.

    Analyze, authorize, and implement VM resource (Compute and Storage) requests.

    Participate in a follow-the-sun operational model, supporting and maintaining the VMware Virtual Infrastructure for North and Latin America systems (expanding globally throughout the year).

    Provide on-call and after-hours support to address incidents, maintain infrastructure and support operational efforts.

    Provide training and mentorship to junior team members. Train team members in best practices and act as subject matter expert and escalation contact for infrastructure related issues.

    Proactively ensure the highest levels of systems and infrastructure availability. Perform daily system monitoring, verifying the integrity and availability of systems and key processes, reviewing system and application logs.

    Define, monitor and enforce the capacity and performance standards, metrics and thresholds and help build the instrumentation to effectively manage the environment.

    Work closely with network, security, development, application and support teams in the implementation of infrastructure components that support emerging technologies and applications.

    Automate operational, monitoring, and integrity verification processes (e.g., runbooks) for hardware, server, and system resources and processes.

    Implement common preventive maintenance practices for Dell PowerEdge and Cisco UCS hardware and VMware software suites.

    Ensure that system improvements and changes are implemented and monitor effects of the modifications.

    Help identify and manage risks within the Virtual Infrastructure.

    Produce and maintain the operational documentation, knowledge, and tools needed to effectively manage the VMware Cloud Foundation, vRealize (Aria) Suite, and vSphere platforms.

    Assist in the development and execution of disaster recovery plans.

    Participate in change, incident, and problem management.

    Knowledge, Skills and Competencies:

    Extensive and recent support and engineering experience in hardware, operating systems, storage, and virtualization technologies in a global multi-data center enterprise organization.

    Advanced knowledge and experience with VMware hypervisors (VMware vSphere 7.x, 8.x) and VMware Cloud Foundation.

    Advanced knowledge of and experience with server hardware architecture, configuration, and troubleshooting for Cisco UCS and Dell PowerEdge MX infrastructure.

    Working knowledge and experience with VMware Aria (vRealize) Suite, including Aria Operations (vROPS), Aria Operations for Logs (Log Insight), and Aria Automation (vRA).

    Strong critical thinking and problem-solving skills, with the ability to debug complex cross-system problems and document root cause, including remediation and detection.

    Demonstrated knowledge of a broad range of technology towers, including Storage, Virtualization, Intel, Networking, Data Center Migration, and Disaster Recovery.

    Knowledge of and experience with Windows Server operating systems (Windows Server 2019, 2022).

    Knowledge of risk and controls landscape, ensuring company-wide standards are met.

    Fundamental understanding of network security practices.

    Hands-on experience with PowerShell or Python and APIs.

    Ability to collaborate with different technology towers to achieve common goals.

    Ability to manage multiple streams of work simultaneously.

    Excellent verbal and written communication skills.