Proven experience deploying diffusion-based models and related adaptation and conditioning techniques (e.g. latent diffusion, LoRA, ControlNet) into production environments, ideally across dozens or hundreds of GPUs.
Proficiency in Python and PyTorch, with a focus on optimised inference, model tuning, and memory-efficient execution.
Familiarity with model deployment tools and practices (e.g. model registries, workflow orchestration, CI/CD for ML).
Comfort with weighing performance trade-offs, debugging large-scale systems, and delivering improvements quickly.
Experience working in fast-moving, cross-functional teams shipping real-world AI products.
Ability to switch quickly between deep technical work, product needs, and cross-functional alignment.