Experience operating production cloud services at scale (monitoring, alerting, incident response, post-mortems, continuous improvement). Strong debugging skills across distributed systems. Experience with observability tools and practices (Prometheus, Grafana, OpenTelemetry, distributed tracing). Experience building and operating controllers that interact with the Kubernetes API server. Comfort working directly with customers to resolve complex technical issues. A sense of responsibility and ownership in solving problems. Commitment to excellent work and continuous skill improvement. Empathy for customers and a focus on reliability and debuggability. Clear communication and effective collaboration.