Principal AI Research Scientist, Post-Training Alignment

Remote options across Canada | Full-Time | Principal
Salary not disclosed

Job Details

Required Skills
Machine Learning

Requirements

  • Deep expertise in reinforcement learning and post-training methodologies (RLHF, RLAIF, DPO, PPO).
  • PhD or equivalent industry research experience in machine learning or AI.
  • Proven track record in leading or mentoring research teams.
  • Strong publication history in top-tier ML/AI venues.
  • Experience in alignment research, preference learning, or agentic AI.
  • Strong intuition for model behavior and failure modes.
  • Experience designing evaluation systems for deployment.
  • Familiarity with large-scale training infrastructure.
  • Ability to communicate complex technical concepts.
  • Experience working with or deploying production AI systems.

Responsibilities

  • Lead research and development in post-training methods for foundation models, including reinforcement learning, preference optimization, and alignment techniques such as RLHF, RLAIF, DPO, and PPO.
  • Design and develop novel algorithms that improve model reliability, controllability, reasoning ability, and alignment with human and system objectives.
  • Define and execute experimental frameworks to evaluate model behavior, robustness, safety, and long-horizon reasoning performance.
  • Architect evaluation systems for agentic workflows, tool use, and real-world task completion.
  • Make principled decisions on model improvements.
  • Lead model analysis and interpretability efforts.
  • Collaborate with infrastructure teams to build scalable post-training pipelines.
  • Establish model readiness criteria for deployment.
  • Contribute to scientific publications and patents.
  • Communicate technical risks and strategic trade-offs.