Follow the latest research on LLMs, inference, and source-code generation. Propose and evaluate innovations in inference quality and efficiency. Implement and monitor LLM inference metrics in production. Write high-quality, high-performance code in Python, Cython, C/C++, Triton, ThunderKittens, native CUDA, and AWS Neuron. Collaborate with the team, plan next steps, and communicate progress.
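To illustrate the metrics-monitoring responsibility above, here is a minimal sketch of measuring two standard LLM inference metrics, time-to-first-token and decode throughput, for a streaming generation call. The function and field names are hypothetical and not tied to any specific serving stack.

```python
import time
from dataclasses import dataclass

@dataclass
class InferenceMetrics:
    ttft_s: float            # time to first token, in seconds
    decode_tok_per_s: float  # steady-state decode throughput

def measure(token_stream):
    """Consume a token-yielding iterable and record latency metrics.

    `token_stream` can be any iterable that yields generated tokens;
    this helper is illustrative, not a specific framework's API.
    """
    start = time.perf_counter()
    first = None
    count = 0
    for _ in token_stream:
        now = time.perf_counter()
        if first is None:
            first = now  # timestamp of the first emitted token
        count += 1
    end = time.perf_counter()
    ttft = (first - start) if first is not None else float("nan")
    decode_time = (end - first) if first is not None else 0.0
    # Throughput over the decode phase (tokens after the first one).
    tps = (count - 1) / decode_time if count > 1 and decode_time > 0 else 0.0
    return InferenceMetrics(ttft_s=ttft, decode_tok_per_s=tps)
```

In production these values would typically be exported to a metrics backend (e.g. as histograms) rather than returned, but the measurement logic is the same.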