- Lead ambitious research initiatives that advance the state of the art in computer vision, multimodal understanding, and visual generation
- Develop novel models, algorithms, and training methodologies for challenging vision problems such as image and video understanding, visual search, scene representation, segmentation, detection, generation, and multimodal reasoning
- Translate cutting-edge research into practical model improvements that shape product direction and unlock new user experiences
- Design, implement, train, and optimize large-scale vision and multimodal foundation models across diverse datasets and tasks
- Partner closely with applied scientists, ML engineers, and product teams to move research from exploration to production-ready systems
- Set technical direction for high-impact research areas, identifying promising bets and influencing longer-term strategy for vision and multimodal AI
- Mentor other scientists and engineers, raise the quality bar for research, and help build a strong scientific culture across the team
- Stay at the forefront of research in computer vision, multimodal learning, generative modeling, and foundation models, and apply emerging techniques to real-world problems