Senior Site Reliability Engineer
GoDaddy, IT Operations
Bulgaria, Full-Time, Senior
Salary: 41,000 - 61,000 EUR per year
Job Details
- Experience: 6+ years
- Required Skills: Docker, Python, Bash, Kubernetes, Prometheus, Linux, Terraform, Ansible
Requirements
- 6+ years of professional experience in Site Reliability Engineering or Platform Engineering
- Deep hands-on expertise with Kubernetes and Docker in production environments
- Advanced Linux troubleshooting skills covering kernel internals, TCP/IP, DNS, and load balancers
- Proficiency in Python for production-grade automation and scripting
- Working knowledge of Bash
- Expertise in Ansible and at least one additional Infrastructure as Code tool such as Terraform or Pulumi
- Hands-on mastery of Icinga, Prometheus, and Grafana
- Understanding of large language models, embeddings, and basic machine learning pipelines
Responsibilities
- Design, implement, and operate scalable, highly available production services while diagnosing and resolving complex infrastructure, network, and application issues
- Build and maintain alerting pipelines, dashboards, and SLO-driven monitoring strategies using Icinga, Prometheus, and Grafana
- Lead incident response end-to-end, performing root-cause analysis, authoring blameless post-mortems, and driving corrective actions to closure
- Develop and extend Infrastructure as Code coverage and build internal tooling that eliminates manual, repetitive operational work
- Mentor SRE I and SRE II engineers through code reviews, debugging sessions, and knowledge-sharing talks
- Apply LLM-driven log analysis, anomaly detection, and generative AI tools to accelerate incident response and runbook creation