3+ years of experience in applied deep learning research
Solid understanding of neural network types, architectures, and loss mechanisms
Proven experience with large language models (LLMs)
Experience with data curation
Experience with distributed large-scale training
Experience with optimization of transformer architecture
Experience with Reinforcement Learning (RL) training
Strong coding experience in Python
Experience working with PyTorch
Experience with various transformer architectures (auto-regressive, sequence-to-sequence, etc.)
Experience with distributed computing
Experience with large-scale data processing
Prior experience in conducting experimental programs and using results to optimize models