Senior Site Reliability Engineer
Flip GmbH | AI Employee Experience
Remote (Europe) | Full-Time | Senior
Salary not disclosed
Job Details
- Languages: English
- Experience: 5+ years
- Required Skills: Python, Kubernetes, Azure, Go, Terraform
Requirements
- 5+ years of hands-on experience in SRE, Platform, DevOps, or Infrastructure engineering.
- Proven track record building/operating high-throughput, high-availability systems.
- Production-level experience with Kubernetes on hyperscalers.
- Experience with modern observability stacks (e.g., Prometheus, Mimir, VictoriaMetrics, Loki).
- Solid software development skills in Go (preferred) or Python.
- Hands-on experience with Infrastructure as Code (Pulumi, OpenTofu, Terraform).
- Experience with GitOps (e.g., ArgoCD) and CI/CD pipeline design.
- Ability to lead complex infrastructure initiatives from design to production.
- Experience mentoring engineers.
- Fluent English communication skills.
- Willingness to participate in on-call rotations.
Responsibilities
- Co-own the architecture and evolution of cloud infrastructure on Azure and Kubernetes.
- Define resilience strategy including global scaling, zero-downtime, and disaster recovery.
- Improve observability stack foundations (Loki, Grafana, Tempo, Mimir).
- Develop self-service Infrastructure as Code platforms.
- Lead platform-related major incidents and drive post-mortems.
- Mentor team members and conduct RFCs/design reviews.
- Partner with the squad to define platform roadmaps.