Director of Cloud Operations
United States | Full-Time | Director
Salary: 200,000 - 228,000 USD per year
Job Details
- Experience: 10+ years
- Required Skills: AWS, Kubernetes, CI/CD, Terraform, Microservices
Requirements
- 10+ years of experience in cloud infrastructure, DevOps, or Site Reliability Engineering, including leadership of CloudOps or SRE teams.
- Proven experience operating and scaling multi-region, customer-facing SaaS platforms in high-availability environments.
- Strong hands-on expertise with AWS, Kubernetes (EKS), Terraform, CI/CD pipelines, and modern cloud-native architectures.
- Deep understanding of distributed systems, microservices architecture, and reliability engineering principles.
- Experience with observability platforms and incident management practices, including on-call operations and production support.
- Strong knowledge of SLO/SLI frameworks, system performance tuning, and operational best practices.
- Demonstrated ability to lead hybrid teams while balancing strategic leadership with hands-on technical contribution.
- Excellent collaboration and communication skills with the ability to influence across engineering, product, and security teams.
- Pragmatic, outcomes-driven leadership style with a focus on continuous improvement and measurable impact.
Responsibilities
- Own the availability, performance, scalability, and resilience of a multi-region AWS cloud platform supporting large-scale SaaS services.
- Define and drive reliability engineering practices, including SLIs/SLOs, error budgets, and proactive system improvement initiatives.
- Lead incident management processes, including on-call rotations, escalation workflows, and post-incident reviews to reduce MTTR and improve system recovery.
- Oversee architecture and operational strategy for microservices, Kubernetes (EKS), and serverless workloads to ensure scalability and fault tolerance.
- Advance observability practices using modern monitoring tools to deliver actionable insights across infrastructure and application layers.
- Drive operational efficiency through automation, CI/CD optimization, infrastructure-as-code practices, and AI-assisted operational workflows.
- Lead cost optimization efforts across cloud environments while maintaining performance, reliability, and security standards.
- Manage and develop a distributed CloudOps engineering team, fostering accountability, technical excellence, and continuous learning.
- Ensure stable operations of hybrid environments, including legacy systems hosted in private data centers alongside modern cloud infrastructure.