Senior Site Reliability Engineer

Flip GmbH, Research & Development
Remote (Europe), Full-Time, Senior
Salary not disclosed

Job Details

Languages
English
Experience
5+ years
Required Skills
Python, Kubernetes, Azure, Go, CI/CD, Terraform

Requirements

  • 5+ years of experience as an SRE, Platform, DevOps, Infrastructure, Cloud, or Backend Engineer
  • Proven track record building and operating high-throughput, highly available production systems
  • Deep production-level experience with Kubernetes on any hyperscaler
  • Strong experience with modern observability stacks (e.g., Prometheus, Mimir, VictoriaMetrics, Dash0, Loki, ELK)
  • Clear point of view on SLIs, SLOs, and error budgets
  • Solid software development skills in Go or Python
  • Hands-on experience with Infrastructure as Code (Pulumi, OpenTofu, Terraform)
  • Experience with GitOps (e.g., ArgoCD) and CI/CD pipeline design
  • Demonstrated ability to lead infrastructure initiatives from design to production
  • Experience mentoring engineers
  • Business-fluent English
  • Willingness to participate in on-call rotations

Responsibilities

  • Co-own the architecture and evolution of the cloud infrastructure on Azure and Kubernetes
  • Define resilience strategies including scaling, zero-downtime deployments, and disaster recovery
  • Evolve the observability stack using Loki, Grafana, Tempo, and Mimir
  • Improve the Infrastructure as Code platform to enable self-service
  • Lead platform-related major incidents and facilitate blameless post-mortems
  • Mentor teammates, run RFCs, and conduct design reviews
  • Partner with the squad to define the platform roadmap