Senior Site Reliability Engineer

Replit | Software Development

Remote - Europe; Secondary Locations: Remote - France, Remote - Ireland, Remote - Italy, Remote - Netherlands, Remote - United Kingdom | Full-Time | Senior

Salary not disclosed

Job Details

Requirements

4-8 years of experience in Site Reliability Engineering, DevOps, or Systems/Infrastructure Engineering.
Strong programming skills in Python, Go, or similar languages.
Deep understanding of distributed systems.
Experience with Kubernetes and cloud-native technologies.
Proven track record of implementing observability solutions.
Strong incident management and response leadership experience.
Experience with infrastructure as code and configuration management tools.

Responsibilities

Design and implement observability solutions using modern tools.
Develop dashboards, metrics, and logging strategies for system health.
Architect and implement infrastructure automation using Terraform, Ansible, or Pulumi.
Design and maintain CI/CD pipelines.
Create self-healing systems for failure scenarios.
Define and implement SLOs and SLIs in collaboration with engineering teams.
Lead incident response and conduct post-mortems.
Build tools and processes to reduce MTTR.
Optimize infrastructure performance, capacity, and resource utilization.
