Senior Site Reliability Developer

United States | Full-Time | Senior

Salary not disclosed

Job Details

Requirements

Proven experience operating large-scale distributed systems
Strong background in SRE practices and production operations
Hands-on experience with Kubernetes-based services
Experience building and maintaining cloud-native infrastructure
Proficiency with automation, CI/CD pipelines, and infrastructure as code
Experience implementing and managing observability tools (metrics, logging, tracing, alerting)
Strong problem-solving skills for incident response and root-cause analysis

Responsibilities

Operate and improve large-scale distributed systems powering Clinical AI Assistant services
Build automation that improves reliability, scalability, and operational efficiency
Improve observability across metrics, logging, tracing, and alerting
Participate in production operations, incident response, and root-cause analysis
Help build self-healing infrastructure and operational tooling
Support Kubernetes-based services and cloud-native infrastructure
Partner with software engineers to improve reliability before systems reach production
Contribute to CI/CD pipelines, infrastructure as code, and platform engineering standards
Learn and apply modern SRE practices for AI-powered healthcare systems
