Senior Site Reliability Engineer

United States | Full-Time | Senior

Salary not disclosed

Job Details

Build and maintain a central monitoring and alerting layer for AI applications and pipelines
Define and implement SLIs, alerts, and operational dashboards
Manage incidents including triage, coordination, root cause analysis, and prevention
Standardise telemetry across systems including latency, throughput, and failures
Optimise CI/CD pipelines and introduce quality gates for reliability
Work closely with engineering teams to reduce recurring issues and improve stability