Senior Machine Learning Engineer
Remote-friendly within Australia | Full-Time | Senior
Salary not disclosed
Job Details
- Required Skills: Cloud Computing, Kubernetes, PyTorch, Distributed Systems
Requirements
- Strong experience in training pipelines, distributed systems, or large-scale AI infrastructure
- Strong experience with Kubernetes and containerized workloads
- Experience with distributed frameworks such as Ray or PyTorch distributed training
- Familiarity with cloud/infrastructure services for high-performance AI (e.g., high-performance storage, HPC environments, fast interconnects)
- Experience with AWS services such as FSx and EFA (Elastic Fabric Adapter)
- Strong sense of ownership and ability to work on cross-cutting problems
- Focus on scalability, reliability, usability, and developer experience
Responsibilities
- Contribute to the evolution of Canva’s unified training platform for AI training workloads
- Improve reliability, observability, debugging, and operational support for training systems
- Design and build the platform capabilities that enable better scheduling at scale, including resource allocation, priority management, and quota management
- Collaborate closely with research scientists, ML engineers, product teams, and cloud/infrastructure teams
- Contribute to system design and architecture decisions across Canva’s AI Platform
- Help shape the platform roadmap and priorities based on user pain points and long-term maturity
- Mentor engineers and share best practices in AI systems and infrastructure