Site Reliability Engineer

Supabase
Database Platform
We hire globally. We believe you can do your best work from anywhere.
Full-Time, Senior
Salary not disclosed

Job Details

Experience
7+ years
Required Skills
AWS, Kubernetes, Postgres, Terraform

Requirements

  • 7+ years of experience in SRE, production engineering, or reliability-focused roles.
  • Strong background in shaping SRE practices and driving cross-team adoption.
  • Software engineering mindset with experience writing code and building tools.
  • Hands-on experience defining and operationalizing SLOs/SLIs at scale.
  • Deep experience with incident response and postmortem facilitation.
  • Proficiency with cloud infrastructure (AWS preferred).
  • Experience with infrastructure-as-code (Pulumi preferred, Terraform/CDK acceptable).
  • Experience working with large-scale multi-tenant systems.
  • Ability to influence without authority in distributed organizations.
  • Experience in asynchronous or globally distributed team environments.

Responsibilities

  • Partner with service teams to define meaningful SLIs and SLOs grounded in customer experience.
  • Build error budget policies that turn reliability metrics into engineering decisions.
  • Own and evolve the Operational Readiness Review (ORR) process for new services and major changes.
  • Strengthen the incident-to-improvement pipeline by connecting postmortem findings to systemic fixes.
  • Act as a reliability expert for architecture reviews, failure mode analysis, and resilience design.
  • Identify operational toil and build or advocate for automation to eliminate it.
  • Help teams design sustainable on-call practices including alert quality and runbook coverage.