Engineering Manager - Machine Learning
This is a remote position based in Toronto, Canada.
Full-Time, Manager
Salary: 210,070 - 282,851 CAD per year
Job Details
- Required Skills: Docker, Python, GCP, Kubernetes, PyTorch, MLOps, Distributed Systems
Requirements
- Hands-on experience as a tech lead or manager in infrastructure, MLOps, and distributed systems.
- Experience in machine learning, orchestration, and agentic systems.
- Strong leadership, mentoring, and coaching skills.
- Ability to partner and communicate across organizational boundaries.
- Experience with ML infrastructure and model deployment.
- Proficiency in distributed compute and GPU optimization.
- Experience with the following stack: Python, PyTorch, Docker, Kubernetes, Ray, Weights & Biases, Prefect, BigQuery, Postgres, GCP, CUDA.
Responsibilities
- Enable the AI/ML, LLM, and Agentic Systems teams to operate at scale.
- Build and operate platforms for training, deploying, and monitoring models.
- Work with researchers and ML engineers to understand infrastructure needs.
- Mentor, coach, and sponsor team members to deliver impact and professional growth.
- Partner with ML research, platform engineering, and business teams.
- Optimize GPU cluster utilization and implement agentic orchestration.
- Establish company-wide MLOps standards.