Senior Machine Learning Engineer - (GenAI)

Posted 10 days ago

💎 Seniority level: Senior

📍 Location: Australia

🔍 Industry: AI

🏢 Company: Leonardo.Ai

🗣️ Languages: English

🪄 Skills: Python, Machine Learning, PyTorch, CI/CD

Requirements:

Proven experience deploying diffusion-based models (e.g. latent diffusion, LoRA, ControlNet) into production environments, ideally across dozens or hundreds of GPUs.
Proficiency in Python and PyTorch, with a focus on optimised inference, model tuning, and memory-efficient execution.
Familiarity with model deployment tools and practices (e.g. model registries, workflow orchestration, CI/CD for ML).
Comfort with performance trade-offs, debugging large-scale systems, and delivering improvements fast.
Experience working in fast-moving, cross-functional teams shipping real-world AI products.
Ability to pivot quickly between deep technical work, product needs, and cross-functional alignment.

Responsibilities:

Build and maintain robust production pipelines that deploy generative models across multiple services, each running on hundreds of GPUs.
Contribute to one of the world’s highest-throughput GenAI systems, generating millions of images and videos daily.
Utilise quantisation, compilation, caching, distillation, and multi-GPU parallelism to enhance throughput, latency, and stability.
Collaborate closely with researchers to productionise new capabilities, such as LoRAs, ControlNets, and custom architectures.
Tackle a wide range of problems, from orchestrating massive multi-GPU video pipelines to optimising end-to-end latency and hardening workloads to run at global scale.

Related Jobs

🔥 Senior Machine Learning Engineer - (GenAI)

Posted 16 days ago

📍 Australia

🧭 Full-Time

🔍 AI

🏢 Company: Leonardo.Ai

🔧 Requirements

Strong experience building and managing MLOps pipelines using frameworks like Kubeflow, MLflow, or similar.
Proficiency in Python, focusing on writing high-performance, maintainable code.
Hands-on experience with AWS services (e.g., S3, EC2, SageMaker), and infrastructure-as-code tools like Terraform.
Deep understanding of Docker and container orchestration tools like Kubernetes.
Experience designing scalable ETL pipelines and working with SQL and NoSQL databases.

💡 Responsibilities

Design, build, and maintain robust MLOps pipelines to support the end-to-end lifecycle of machine learning models, including data preparation, training, deployment, monitoring, and retraining.
Integrate ComfyUI nodes and other workflow tools into the MLOps ecosystem, optimising for performance and scalability.
Collaborate with DevOps teams to implement and manage cloud infrastructure, focusing on AWS (e.g., S3, EC2, SageMaker) using tools like Terraform and CloudFormation.
Implement CI/CD pipelines tailored for machine learning workflows, ensuring smooth transitions from research to production.
Design and maintain scalable data pipelines for collecting, processing, and storing large volumes of data.
Automate data acquisition and preprocessing workflows, optimising I/O bandwidth and implementing efficient storage solutions.
Manage data integrity and ensure compliance with privacy and security standards.
Deploy machine learning models to production, ensuring robustness, scalability, and low latency.
Implement monitoring solutions for deployed models to track performance metrics, detect drift, and trigger retraining pipelines.
Continuously optimise inference performance using techniques like model quantisation, distillation, or caching strategies.
Work closely with cross-functional teams, including AI researchers, data engineers, and software developers, to support ongoing projects and align MLOps efforts with organisational goals.
Proactively identify opportunities to streamline and automate workflows, driving innovation and efficiency.
Operate independently to manage deadlines and deliverables, producing high-quality solutions in a dynamic environment.

AWS, Docker, Python, SQL, ETL, Kubeflow, Kubernetes, Machine Learning, MLflow, Data engineering, Grafana, Prometheus, REST API, CI/CD, DevOps, Terraform
