Australia
Full-Time
AI
Company: Leonardo.Ai
- Proven experience deploying diffusion-based models (e.g. latent diffusion, LoRA, ControlNet) into production environments, ideally across dozens or hundreds of GPUs.
- Proficiency in Python and PyTorch, with a focus on optimised inference, model tuning, and memory-efficient execution.
- Familiarity with model deployment tools and practices (e.g. model registries, workflow orchestration, CI/CD for ML).
- Comfort with performance trade-offs, debugging large-scale systems, and delivering improvements fast.
- Experience working in fast-moving, cross-functional teams shipping real-world AI products.
- Ability to pivot quickly between deep technical work, product needs, and cross-functional alignment.
- Build and maintain robust production pipelines that deploy generative models across multiple services, each running on hundreds of GPUs.
- Contribute to one of the world's highest-throughput GenAI systems, generating millions of images and videos daily.
- Utilise quantisation, compilation, caching, distillation, and multi-GPU parallelism to enhance throughput, latency, and stability.
- Collaborate closely with researchers to productionise new capabilities, such as LoRAs, ControlNets, and custom architectures.
- Tackle a wide range of problems, from orchestrating massive multi-GPU video pipelines to optimising end-to-end latency and hardening scalable workloads to run at global scale.
Python, Machine Learning, PyTorch, CI/CD
Posted 10 days ago