Senior Site Reliability Engineer, Database Operations

Job Description

GitLab is an open core software company that develops the most comprehensive AI-powered DevSecOps Platform, used by more than 100,000 organizations. Our mission is to enable everyone to contribute to and co-create the software that powers our world. When everyone can contribute, consumers become contributors, significantly accelerating the rate of human progress. This mission is integral to our culture, influencing how we hire, build products, and lead our industry. We make this possible at GitLab by running our operations on our product and staying aligned with our values. Learn more about Life at GitLab.

Responsibilities
Site Reliability Engineers (SREs) are responsible for keeping all user-facing services and other GitLab production systems running smoothly 24x7x365. SREs are a blend of pragmatic operators and software craftspeople who apply sound engineering principles, operational discipline, and mature automation to our environments and the GitLab codebase. We specialize in systems, whether networking, the Linux kernel, or a more specific interest in scaling, algorithms, or distributed systems, and work across these functions:

  • Design, build, and maintain ClickHouse and PostgreSQL clusters to support high-demand, enterprise-scale workloads.
  • Provision and orchestrate cloud infrastructure in GCP using configuration management tools (Ansible, Chef), IaC (Terraform), the Kubernetes ecosystem (Helm charts, Operators), and distributed consensus (etcd).
  • Design and implement enterprise-grade, high-availability ClickHouse solutions with ClickHouse Keeper, sharding, and replication, optimized for large-scale and dynamic datasets.
  • Optimize and scale high-transaction PostgreSQL clusters with Patroni and streaming replication for GitLab’s core applications on GCP.
  • Build and maintain early warning systems, monitoring, and alerting tools (e.g., Prometheus/Grafana) to predict capacity needs, monitor query latency and replication lag, and ensure resource optimization across platforms.
  • Enable cross-database integrations and workflows, such as ClickHouse-to-PostgreSQL data federation, CDC, and logical replication, to support hybrid analytics.
  • Respond to platform alerts, user emergencies, and support requests while ensuring strict adherence to SLOs, including during SRE on-call rotations.
  • Enhance infrastructure security by implementing and updating measures that protect GitLab’s systems and ensure compliance with regulatory requirements (e.g., GDPR, FedRAMP, SOC2, ISO).
  • Partner with internal and external compliance assessors as Subject Matter Experts during certifications and recertifications.
  • Collaborate with engineering teams to address architectural bottlenecks, plan service rollouts and migrations, and shape the future roadmap while maintaining strong operational readiness.

The Database Operations Engineer at GitLab is responsible for building, running, owning, and evolving the database engines for GitLab.com across their entire lifecycle. See the Database Operations team page for details.

Skills

  • SRE
  • GCP
  • IaC
  • Kubernetes
  • Subject Matter Experts

Education

  • Master's Degree
  • Bachelor's Degree

Job Information

Job Posted Date

Apr 01, 2025

Experience

5-10 Years

Compensation (Annual in Lacs)

Best in the Industry

Work Type

Permanent

Type Of Work

8 hour shift

Category

Information Technology

Copyright © 2022 All Rights Reserved. Saas Talent