Senior Site Reliability Engineer

Flip GmbH
AI Employee Experience
Remote (Europe) | Full-Time | Senior
Salary not disclosed

Job Details

Languages
English
Experience
5+ years
Required Skills
Python, Kubernetes, Azure, Go, Terraform

Requirements

  • 5+ years of hands-on experience in SRE, Platform, DevOps, or Infrastructure engineering.
  • Proven track record building/operating high-throughput, high-availability systems.
  • Production-level experience running Kubernetes on hyperscaler cloud platforms.
  • Experience with modern observability stacks (e.g., Prometheus, Mimir, VictoriaMetrics, Loki).
  • Solid software development skills in Go (preferred) or Python.
  • Hands-on experience with Infrastructure as Code (Pulumi, OpenTofu, Terraform).
  • Experience with GitOps (e.g., ArgoCD) and CI/CD pipeline design.
  • Ability to lead complex infrastructure initiatives from design to production.
  • Experience mentoring engineers.
  • Fluent English communication skills.
  • Willingness to participate in on-call rotations.

Responsibilities

  • Co-own the architecture and evolution of cloud infrastructure on Azure and Kubernetes.
  • Define resilience strategy including global scaling, zero-downtime, and disaster recovery.
  • Improve observability stack foundations (Loki, Grafana, Tempo, Mimir).
  • Develop self-service Infrastructure as Code platforms.
  • Lead platform-related major incidents and drive post-mortems.
  • Mentor team members and conduct RFCs/design reviews.
  • Partner with the squad to define platform roadmaps.