- Ensure reliability, uptime, and performance across GCP environments.
- Implement SRE and DevOps best practices with a strong focus on automation and scalability.
- Build and optimize CI/CD pipelines using GCP-native tools.
- Lead observability initiatives using Grafana, Prometheus, and Stackdriver (now Google Cloud Observability).
- Troubleshoot production incidents and deliver root-cause fixes.
- Provision and manage infrastructure as code using Terraform and Deployment Manager.
- Partner with cross-functional teams to maintain platform stability.
- Champion a proactive, blameless incident management culture.
- Drive continuous improvement by evaluating and adopting emerging cloud and automation technologies.