5+ years of experience in Platform Engineering, DevOps, or Site Reliability Engineering (SRE). Proven, hands-on experience building and managing production infrastructure with Terraform. Expert-level knowledge of Kubernetes architecture and operations at scale. Experience with high-performance compute (HPC) job schedulers, specifically Slurm. Experience managing bare metal infrastructure, including server provisioning and lifecycle management. Strong scripting and automation skills (e.g., Python, Go, Bash).