AI Research Engineer (Multi-Modal Reinforcement Learning)

United States, Full-Time, Middle
Salary not disclosed

Job Details

Required Skills
Machine Learning, PyTorch, Deep Learning, NLP, Computer Vision

Requirements

  • Master’s degree in Computer Science or a related field required; PhD in ML, CV, NLP, or an AI-related discipline preferred.
  • Strong publication record in top-tier AI conferences (NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV).
  • Proven experience in large-scale reinforcement learning experiments, particularly in multi-modal or vision-centric systems.
  • Deep understanding of reinforcement learning theory, optimization, and policy learning in high-dimensional environments.
  • Strong hands-on experience with PyTorch and deep learning frameworks for multimodal AI systems.
  • Experience building end-to-end RL pipelines including simulation, training, evaluation, and deployment.
  • Ability to address core RL challenges such as sample efficiency, exploration-exploitation trade-offs, and training stability.
  • Strong analytical and problem-solving skills with a research-driven, experimental mindset.

Responsibilities

  • Conduct research on reinforcement learning methods for multi-modal systems, including diffusion-based and autoregressive model approaches.
  • Design and build scalable RL infrastructure supporting distributed training and evaluation across complex multi-modal environments.
  • Develop reward modeling strategies that improve alignment and training stability and mitigate failure modes such as reward hacking.
  • Create and curate simulation environments and datasets for training, benchmarking, and validating multi-modal RL models.
  • Design and execute evaluation protocols to measure performance improvements and ensure reproducibility across experiments.
  • Analyze model behavior across modalities, identifying bottlenecks in optimization, exploration, and cross-modal alignment.
  • Explore and develop next-generation RL paradigms that enhance learning from environment feedback and advance state-of-the-art performance.
  • Publish research in leading AI conferences such as NeurIPS, ICML, ICLR, CVPR, and related venues.