Site Reliability Engineer
Supabase
Database Platform
We hire globally. We believe you can do your best work from anywhere.
Full-Time, Senior
Salary not disclosed
Job Details
- Experience: 7+ years
- Required Skills: AWS, Kubernetes, Postgres, Terraform
Requirements
- 7+ years of experience in SRE, production engineering, or reliability-focused roles.
- Strong background in shaping SRE practices and driving cross-team adoption.
- Software engineering mindset with experience writing code and building tools.
- Hands-on experience defining and operationalizing SLOs/SLIs at scale.
- Deep experience with incident response and postmortem facilitation.
- Proficiency with cloud infrastructure (AWS preferred).
- Experience with infrastructure-as-code (Pulumi preferred, Terraform/CDK acceptable).
- Experience working with large-scale multi-tenant systems.
- Ability to influence without authority in distributed organizations.
- Experience in asynchronous or globally distributed team environments.
Responsibilities
- Partner with service teams to define meaningful SLIs and SLOs grounded in customer experience.
- Build error budget policies that turn reliability metrics into engineering decisions.
- Own and evolve the Operational Readiness Review (ORR) process for new services and major changes.
- Strengthen the incident-to-improvement pipeline by connecting postmortem findings to systemic fixes.
- Act as a reliability expert for architecture reviews, failure mode analysis, and resilience design.
- Identify operational toil and build or advocate for automation to eliminate it.
- Help teams design sustainable on-call practices including alert quality and runbook coverage.