Site Reliability Engineer

Supabase

Database Platform

We hire globally. We believe you can do your best work from anywhere.

Full-Time, Senior

Salary not disclosed

Job Details

Requirements

7+ years of experience in SRE, production engineering, or reliability-focused roles.
Strong background in shaping SRE practices and driving cross-team adoption.
Software engineering mindset with experience writing code and building tools.
Hands-on experience defining and operationalizing SLOs/SLIs at scale.
Deep experience with incident response and postmortem facilitation.
Proficiency with cloud infrastructure (AWS preferred).
Experience with infrastructure-as-code (Pulumi preferred, Terraform/CDK acceptable).
Experience working with large-scale multi-tenant systems.
Ability to influence without authority in distributed organizations.
Experience in asynchronous or globally distributed team environments.

Responsibilities

Partner with service teams to define meaningful SLIs and SLOs grounded in customer experience.
Build error budget policies that turn reliability metrics into engineering decisions.
Own and evolve the Operational Readiness Review (ORR) process for new services and major changes.
Strengthen the incident-to-improvement pipeline by connecting postmortem findings to systemic fixes.
Act as a reliability expert for architecture reviews, failure mode analysis, and resilience design.
Identify operational toil and build or advocate for automation to eliminate it.
Help teams design sustainable on-call practices including alert quality and runbook coverage.
