Staff Software Engineer - Databases SRE
Grafana Labs
Cloud Observability
This is a remote opportunity, and we are looking for candidates from the UK, Sweden, Spain, or Germany.
Full-Time
Senior
Salary: €109,709 - €131,651
Job Details
- Experience: 8+ years engineering experience, 4+ in SRE/CRE/production engineering
- Required Skills: AWS, Python, GCP, Java, Kubernetes, Azure, Go, Terraform, Helm
Requirements
- 8+ years of engineering experience.
- 4+ years in SRE, CRE, or production engineering.
- Strong Kubernetes experience in AWS, GCP, or Azure.
- Familiarity with infrastructure-as-code tooling (e.g., Helm, Terraform, Jsonnet).
- Technical leadership experience, including project management and mentoring.
- Experience operating multi-tenant systems in production.
- Strong experience designing and implementing SLOs.
- Experience with one or more programming languages (e.g., Go, Python, Java).
- Experience with Linux OS internals, networking, cloud storage, and scaling.
- Excellent problem-solving and troubleshooting skills.
- Experience with blame-free Incident Response and writing high-quality Post-Incident Reviews (PIRs).
Responsibilities
- Partner closely with product engineering squads in an embedded model.
- Own production reliability for high-SLA and complex customer environments.
- Design and implement automation to scale reliability practices and eliminate toil.
- Define and evolve per-tenant SLOs and reliability models.
- Serve as a primary escalation point and on-call for incidents.
- Lead customer-impacting incident response and post-incident reviews.
- Contribute to design docs and code reviews.
- Influence feature design to ensure production scalability and operability.
- Improve alert quality and reduce noisy escalations.