Senior Manager, Site Reliability Engineering

Clover Health
Healthcare Technology
Remote - USA, US, HK, NZ
Full-Time
Manager
Salary: $187,000 to $243,000 USD

Job Details

Experience
6+ years managing an SRE team and 10+ years of hands-on SRE or infrastructure engineering experience
Required Skills
PostgreSQL, Python, GCP, Kubernetes, Go, Grafana, Prometheus, CI/CD, Terraform, GitHub Actions

Requirements

  • 6+ years managing an SRE team.
  • 10+ years of hands-on SRE or infrastructure engineering experience.
  • Deeply comfortable with Kubernetes and GCP (GKE, Cloud SQL, Pub/Sub, GCS).
  • Experience with Terraform, Helm, and ArgoCD.
  • Proficiency with PostgreSQL.
  • Experience with Prometheus and Grafana.
  • Strong programming skills in Python and/or Go.
  • Experience with CI/CD pipelines (GitHub Actions).
  • Track record of building or improving developer tooling and automation.
  • Sound build vs. buy judgment for infrastructure solutions.
  • Experience leading teams across multiple time zones.

Responsibilities

  • Lead and grow our SRE team of ~10 engineers, including hiring, retention, career development, and performance management across multiple time zones.
  • Build strategic partnerships with product engineering pillars, shifting SRE from reactive, ticket-based support to proactive co-ownership of reliability outcomes.
  • Scale our multi-tenant infrastructure to support new customer onboarding and growing patient populations.
  • Own cloud cost management and FinOps practices, building frameworks that balance cost control with reliability and performance.
  • Champion developer self-service and platform engineering.
  • Establish SLOs/SLIs for critical services and improve alert quality so every page is meaningful.
  • Ensure the SRE team is fully leveraging AI tooling in their workflows.