Site Reliability Engineer (SRE) (m/f/d)

Posted 2024-11-07

📍 Location: Germany and within Europe

🔍 Industry: Technology / Employee Communication

🏢 Company: Flip App

🗣️ Languages: English, German

🪄 Skills: AWS, Python, Software Development, GCP, Kotlin, Kubernetes, Azure, Go, Grafana, Prometheus, CI/CD

Requirements:
  • Experience in operating and scaling cloud infrastructures (Azure, AWS, GCP).
  • Deep knowledge of Kubernetes and container solutions.
  • Interest in observability tools such as Prometheus, VictoriaMetrics, Mimir, Loki, ELK.
  • Familiarity with SLO, error budget, and Apdex.
  • Good knowledge of software development languages like Go, Python, Kotlin.
  • Business-fluent English; German is a plus.
  • Experience with infrastructure as code tools (e.g., Pulumi, OpenTofu) and automation tools (e.g., Ansible, Chef).
Responsibilities:
  • Ensure the availability, performance, and scalability of the infrastructure.
  • Promote practices like CI/CD, observability, and developer experience.
  • Shape goals for scalable systems and observability.
  • Expand the cloud infrastructure and Kubernetes cluster.
  • Ensure resilience and safety through zero-downtime rollouts.
  • Improve observability by further developing the LGTM stack (Loki, Grafana, Tempo, Mimir).
  • Design, develop, and optimize infrastructure as code using Pulumi in Go.