Sr. Deployment Engineer, AI Inference

Posted 3 months ago
SF Bay Area, Toronto | Full-Time | AI Chip Manufacturing
Company: Cerebras Systems
Location: SF Bay Area, Toronto, EST, PST
Languages: English
Seniority level: Senior
Experience: 5-7 years
Skills:
Docker, Python, Bash, Kubernetes, Grafana, Prometheus, Linux
Requirements:
- 5-7 years of experience operating on-prem compute infrastructure or developing and managing complex AWS control-plane infrastructure for hybrid deployments
- Strong proficiency in Python for automation, orchestration, and deployment tooling
- Solid understanding of Linux-based systems and command-line tools
- Extensive knowledge of Docker containers and container orchestration platforms such as Kubernetes (K8s)
- Familiarity with spine-leaf (Clos) networking architecture
- Proficiency with telemetry and observability stacks such as Prometheus, InfluxDB, and Grafana (a minimal sketch follows this list)
- Strong ownership mindset and accountability for complex deployments
- Ability to work effectively in a fast-paced environment
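As a rough illustration of the telemetry stack named above, the following minimal Python sketch exposes a per-replica health gauge via the prometheus_client library for a Prometheus server to scrape. The metric name, labels, datacenters, and the check_replica_health helper are hypothetical examples, not details from this posting.

# Minimal sketch, assuming prometheus_client is installed and port 9100 is free.
import random
import time

from prometheus_client import Gauge, start_http_server

# Hypothetical gauge: 1 if an inference replica answers health checks, else 0.
REPLICA_HEALTHY = Gauge(
    "inference_replica_healthy",
    "1 if the inference replica responds to health checks, else 0",
    ["datacenter", "replica_id"],
)

def check_replica_health(datacenter: str, replica_id: str) -> bool:
    # Placeholder probe; a real check would query the replica's health endpoint.
    return random.random() > 0.05

def main() -> None:
    start_http_server(9100)  # serves /metrics for Prometheus to scrape
    replicas = [("dc-east", "r0"), ("dc-east", "r1"), ("dc-west", "r0")]
    while True:
        for dc, rid in replicas:
            healthy = check_replica_health(dc, rid)
            REPLICA_HEALTHY.labels(datacenter=dc, replica_id=rid).set(1 if healthy else 0)
        time.sleep(15)

if __name__ == "__main__":
    main()

A Grafana dashboard or Prometheus alert rule would then consume inference_replica_healthy, e.g. alerting when it stays at 0 for several scrape intervals.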
Responsibilities:
- Deploy AI inference replicas and cluster software across multiple datacenters
- Operate across heterogeneous datacenter environments
- Maximize capacity allocation and optimize replica placement using constraint-solver algorithms (a toy placement sketch follows this list)
- Operate bare-metal inference infrastructure while supporting the transition to a Kubernetes-based platform
- Develop and extend telemetry, observability, and alerting solutions
- Develop and extend a fully automated deployment pipeline
- Translate technical and customer needs into actionable requirements
- Stay up to date with the latest advancements in AI compute infrastructure
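The replica-placement responsibility above is the kind of problem a constraint solver handles well. The following minimal Python sketch uses Google OR-Tools CP-SAT to pack replicas into per-datacenter capacity while maximizing allocated slots; the datacenter names, slot counts, and objective are illustrative assumptions, not the company's actual placement model.

# Minimal sketch, assuming ortools is installed (pip install ortools).
from ortools.sat.python import cp_model

# Hypothetical capacities (replica slots) per datacenter and per-replica demands.
datacenters = {"dc-east": 6, "dc-west": 4}
replicas = {"r0": 3, "r1": 2, "r2": 2, "r3": 1}

model = cp_model.CpModel()

# place[r, d] == 1 if replica r is assigned to datacenter d.
place = {
    (r, d): model.NewBoolVar(f"place_{r}_{d}")
    for r in replicas
    for d in datacenters
}

# Each replica lands in at most one datacenter.
for r in replicas:
    model.Add(sum(place[r, d] for d in datacenters) <= 1)

# Assigned replicas must fit within each datacenter's capacity.
for d, cap in datacenters.items():
    model.Add(sum(replicas[r] * place[r, d] for r in replicas) <= cap)

# Objective: maximize the number of slots actually allocated.
model.Maximize(sum(replicas[r] * place[r, d] for r in replicas for d in datacenters))

solver = cp_model.CpSolver()
status = solver.Solve(model)
if status in (cp_model.OPTIMAL, cp_model.FEASIBLE):
    for (r, d), var in place.items():
        if solver.Value(var):
            print(f"replica {r} -> {d}")

In practice the model would carry richer constraints (affinity, network locality, failure domains), but the shape of the problem is the same.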
About the Company
Cerebras Systems
251-500 employees | Computer