- Lead applied research on prompt engineering, context design, and evaluation strategies for conversational and generative AI systems.
- Run controlled experiments and rapid iteration cycles to improve model performance and reliability.
- Build and maintain structured evaluation frameworks using tools such as LangSmith, LangChain evaluators, or similar platforms.
- Partner with engineering, design, and product teams to translate research insights into impactful user experiences.
- Contribute to context architectures and testing methodologies that scale.
- Communicate findings and recommendations clearly to both technical and non-technical stakeholders.