Reliability Engineer - Bengaluru, India - ATDXT Pvt. Ltd.

    ATDXT Pvt. Ltd. Bengaluru, India

    1 week ago

    Full time
    Description
    • Respond quickly to P1 incidents and assess their impact on the environment.
    • Investigate issues and lead technical troubleshooting of the infrastructure and application environment.
    • Strong knowledge of improving the reliability, availability, scalability and performance of the environment.
    • Proactively review monitoring systems and dashboards, analyzing the data to ensure infrastructure and applications are operating within expected parameters, and resolve bottlenecks to keep systems efficient and reliable.
    • Address any critical alerts promptly and investigate the underlying causes.
    • Narrow down whether an issue is related to virtual machine performance, resource contention, network connectivity, or any other relevant aspect of the environment.
    • Fine-tune the environment.
    • Manage capacity.
    • Optimize performance.
    • Validate backups and recoverability of data.
    • Collaborate with the Security team.
    • Participate in CAB meetings and inspect for any gaps.
    • Participate in post-incident reviews and root cause analysis (RCA), and suggest improvements to prevent similar incidents from occurring in the future.
    • Identify opportunities to automate processes and operational tasks to reduce manual effort.