Staff Software Engineer - Databases SRE

Grafana Labs
Cloud Observability
This is a remote opportunity and we are looking for candidates from the UK, Sweden, Spain or Germany.
Full-Time
Senior
Salary: €109,709 - €131,651

Job Details

Experience
8+ years engineering experience, 4+ in SRE/CRE/production engineering
Required Skills
AWS, Python, GCP, Java, Kubernetes, Azure, Go, Terraform, Helm

Requirements

  • 8+ years of engineering experience.
  • 4+ years in SRE, CRE, or production engineering.
  • Strong Kubernetes experience in AWS, GCP, or Azure.
  • Familiarity with infrastructure-as-code tooling (e.g., Helm, Terraform, Jsonnet).
  • Technical leadership experience, including project management and mentoring.
  • Experience operating multi-tenant systems in production.
  • Strong experience designing and implementing SLOs.
  • Experience with one or more programming languages (e.g., Go, Python, Java).
  • Experience with Linux OS internals, networking, cloud storage, and scaling.
  • Excellent problem-solving and troubleshooting skills.
  • Experience with blame-free Incident Response and writing high-quality Post-Incident Reviews (PIRs).

Responsibilities

  • Partner closely with product engineering squads in an embedded model.
  • Own production reliability for high-SLA and complex customer environments.
  • Design and implement automation to scale reliability practices and eliminate toil.
  • Define and evolve per-tenant SLOs and reliability models.
  • Serve as a primary escalation point and on-call for incidents.
  • Lead customer-impacting incident response and post-incident reviews.
  • Contribute to design docs and code reviews.
  • Influence feature design to ensure production scalability and operability.
  • Improve alert quality and reduce noisy escalations.