5+ years in Site Reliability, Cloud, or DevOps Engineering
Experience designing and deploying large-scale systems, multi-vendor platforms, and globally distributed infrastructure
Proven experience managing cloud infrastructure in AWS (multi-account, VPC, EC2, EKS) and Kubernetes at scale
Strong hands-on experience with IaC and automation (Terraform, Ansible, or similar)
Familiarity with CI/CD pipelines and release automation (GitLab preferred, Jenkins acceptable)
Deep understanding of monitoring and observability using Datadog (or equivalent)
Experience with incident management, on-call participation, escalation, and structured postmortems
Scripting skills in Python, Bash, Java, or equivalent
Basic knowledge of Java- or .NET-based development
Strong English communication skills