Senior Site Reliability Engineer

GoDaddy, IT Operations

Bulgaria, Full-Time, Senior

Salary: 41,000 - 61,000 EUR per year

Job Details

Requirements

6+ years of professional experience in Site Reliability Engineering or Platform Engineering
Deep hands-on expertise with Kubernetes and Docker in production environments
Advanced Linux troubleshooting skills covering kernel internals, TCP/IP, DNS, and load balancers
Proficiency in Python for production-grade automation and scripting
Working knowledge of Bash
Expertise in Ansible and at least one additional Infrastructure as Code tool such as Terraform or Pulumi
Hands-on mastery of Icinga, Prometheus, and Grafana
Understanding of large language models, embeddings, and basic machine learning pipelines

Responsibilities

Design, implement, and operate scalable, highly available production services while diagnosing and resolving complex infrastructure, network, and application issues
Build and maintain alerting pipelines, dashboards, and SLO-driven monitoring strategies using Icinga, Prometheus, and Grafana
Lead incident response end-to-end: perform root-cause analysis, author blameless post-mortems, and drive corrective actions to closure
Develop and extend Infrastructure as Code coverage and build internal tooling that eliminates manual, repetitive operational work
Mentor SRE I and SRE II engineers through code reviews, debugging sessions, and knowledge-sharing talks
Apply LLM-driven log analysis, anomaly detection, and generative AI tools to accelerate incident response and runbook creation
