- Design, deploy, and operate Kubernetes infrastructure for AI inference, research, and engineering workloads
- Set up and manage GPU and HPC-style compute environments
- Build and manage Linux-based compute environments
- Help architect bare-metal, cloud, and hybrid infrastructure
- Own the reliability and operational health of infrastructure systems
- Improve deployment workflows, automation, and infrastructure-as-code practices
- Partner with ML engineers and researchers to translate workload requirements into infrastructure designs
- Build tooling, documentation, and runbooks