Lead Site Reliability Engineer

Hyderabad
Full-time

All about Zeta Suite:

Zeta is the world’s first and only Omni Stack for banks and fintechs. We are rethinking payments from the core to the edge, led by the vision to augment the purpose of money and banking with technology. Zeta offers a single, modern software stack comprising processing, loans, customizable mobile and web apps, a fraud engine, and rewards for retail banking.

We are a new-age, high-growth startup (& a unicorn!) founded in 2015 by two visionary leaders, Bhavin Turakhia & Ramki Gaddipati, whose entrepreneurial legacy & excellence have put us on top of the global fintech ecosystem. Zeta counts amongst its customers over 10 banks and 25 fintechs across 8 countries. Notable clients include Sodexo, a leading issuer of employee benefits & rewards with over 30 million global users, and HDFC Bank, the 14th largest bank in the world by market cap. Learn more about our manifesto & beyond.

Life At Zeta

At Zeta, we want you to grow to be the best version of yourself by unlocking the great potential that lies within you. This is why our core philosophy is ‘People Must Grow.’ We recognize your aspirations, act as enablers by bringing you the right opportunities, and let you grow as you chase disruptive goals.

#LifeAtZeta is adventurous and exhilarating at the same time. You get to work with some of the best minds in the industry and experience a culture that values diversity of thought. If you want to push boundaries, learn continuously, and grow to be the best version of yourself, Zeta is the place to be! Explore life at Zeta.

Zeta is an equal opportunity employer.

At Zeta, we are committed to equal employment opportunities regardless of job history, disability, gender identity, religion, race, marital/parental status, or any other special status. We are proud to be an equitable workplace that welcomes individuals from all walks of life if they fit the roles and responsibilities.

Responsibilities:

Establish an SRE site and help build an effective, inclusive SRE team.
Provide technical leadership for the local team and work closely with partner team technical leads and cloud leadership.
Provide guidance to other team members on managing the availability and performance of mission-critical services, on building automation to prevent problem recurrence, and on building automated responses for non-exceptional service conditions.
Manage execution of project priorities, deadlines, and deliverables.
Lead incident management during incidents.
Drive MTTR in line with the incident SLA.
Ensure 100% alert coverage across Application, Infrastructure, Security, Flows, etc.

Qualifications:

6-10 years of experience in distributed systems, storage systems, or databases, algorithms and data structures and/or Unix/Linux systems internals (e.g., filesystems, system calls) and administration.
Experience designing, analyzing, and troubleshooting large-scale distributed systems.
Experience with MySQL or PostgreSQL databases.
Hands-on experience operating k8s and any cloud platform.
Excellent communication skills and a sense of ownership, with a systematic problem-solving approach.

Skills

Linux/Unix
Data Structures and Algorithms
Troubleshooting
Distributed Systems
MySQL
K8s

Education

Master's Degree
Bachelor's Degree

Job Information

Job Posted Date

Jun 07, 2023

Experience

5-10 Years

Compensation (Annual in Lacs)

₹ Market Standard

Work Type

Permanent

Type Of Work

8 hour shift