- Strong experience with ML pipeline orchestration (Kubeflow, MLflow, or similar platforms)
- Expertise in ML production systems (model serving, versioning, monitoring, CI/CD for ML)
- Experience with distributed training (multi-GPU, multi-node) and hardware acceleration (CUDA, TensorRT, or similar)
- Familiarity with cloud platforms (AWS, GCP, or Azure) for compute, storage, and ML services
- Strong communication and collaboration skills