Senior Machine Learning Engineer

Remote-friendly within Australia | Full-Time | Senior
Salary not disclosed

Job Details

Required Skills
Cloud Computing, Kubernetes, PyTorch, Distributed Systems

Requirements

  • Strong experience in training pipelines, distributed systems, or large-scale AI infrastructure
  • Strong experience with Kubernetes and containerized workloads
  • Experience with distributed frameworks such as Ray or PyTorch distributed training
  • Familiarity with cloud/infrastructure services for high-performance AI (e.g., high-performance storage, HPC environments, fast interconnects)
  • Experience with AWS services such as Amazon FSx and Elastic Fabric Adapter (EFA)
  • Strong sense of ownership and ability to work on cross-cutting problems
  • Focus on scalability, reliability, usability, and developer experience

Responsibilities

  • Contribute to the evolution of Canva’s unified training platform for AI training workloads
  • Improve reliability, observability, debugging, and operational support for training systems
  • Design and build platform capabilities that improve scheduling at scale, including resource allocation, priority management, and quota management
  • Collaborate closely with research scientists, ML engineers, product teams, and cloud/infrastructure teams
  • Contribute to system design and architecture decisions across Canva’s AI Platform
  • Help shape the platform roadmap and priorities based on user pain points and long-term platform maturity
  • Mentor engineers and share best practices in AI systems and infrastructure