Site Reliability Engineer

Duvo Inc, AI Operations Platform

EU/UK Based, Full-Time, Middle

Salary: 110,000 - 220,000 EUR per year

Job Details

Required Skills: Docker, GCP, Kubernetes, Grafana, Prometheus, Terraform, Distributed Systems

Requirements

Extensive experience designing and operating large-scale distributed systems.
Solid understanding of security best practices including KMS encryption and WAF configuration.
Proven capability in building observability platforms and managing incident response workflows.
Deep expertise in Infrastructure as Code (IaC) tools and container orchestration.
Strong automation skills and a drive to eliminate manual runbooks.
Demonstrated ability to own projects from proposal to production.
Capacity to make high-judgment decisions regarding reliability investments and trade-offs.
Experience with GCP, Kubernetes, or similar cloud-native environments.
Familiarity with multi-tenant isolation or sandboxed execution environments.

Responsibilities

Own platform reliability, infrastructure, observability, and incident response.
Manage and scale sandbox infrastructure for AI agents.
Design and configure monitoring, alerting, and observability pipelines.
Lead structured incident responses and drive permanent root-cause fixes.
Automate infrastructure using IaC and container orchestration.
Inherit and maintain existing infrastructure (Terraform, OpenTelemetry, Prometheus/Grafana).
Collaborate with AI Platform Engineers to secure and isolate tenant workloads.
