- Strong background in deep learning and transformer architectures
- Hands-on experience training or fine-tuning large models (LLMs or vision models)
- Proficiency with PyTorch, JAX, or TensorFlow
- Experience with distributed training frameworks
- Strong software engineering skills — writing robust, production-grade systems
- Experience with GPU optimization: memory efficiency, quantization, mixed precision
- Comfortable owning ambiguous, zero-to-one technical problems end-to-end