Staff Applied Researcher, AI Quality

Work from anywhere within the United States, Full-Time, Staff
Salary: 140,400 - 372,300 USD per year

Job Details

Experience
4–8 years
Required Skills
Python, Machine Learning, TypeScript, Data Science, Software Engineering

Requirements

  • Bachelor’s, Master’s, or PhD degree in Computer Science, Data Science, Mathematics, Statistics, Physics, Economics, Operations Research, or a related technical field, or equivalent practical experience.
  • 4–8 years of experience in data science, machine learning, applied research, or a related technical field, with the required minimum depending on educational background.
  • Strong software engineering expertise in Python and/or TypeScript, with experience building scalable ML, data, or evaluation pipelines in production environments.
  • Proven experience delivering research systems or AI evaluation frameworks in real-world production settings.
  • Deep understanding of large language model evaluation, alignment, reward modeling, safety assessments, or AI quality methodologies.
  • Experience with large-scale experimentation, benchmarking strategies, and online/offline model evaluation techniques.
  • Strong communication and cross-functional collaboration skills, with the ability to influence technical and product decisions.
  • Experience with developer tools, AI-assisted programming, or code generation systems is highly preferred.
  • Open-source contributions or experience engaging with developer communities is considered a strong advantage.

Responsibilities

  • Design and implement advanced evaluation frameworks for large language models, including code generation, reasoning, multimodal capabilities, safety, and agentic workflows.
  • Develop scalable evaluation methodologies such as automated metrics, reward models, LLM-judge systems, and human-in-the-loop evaluation pipelines.
  • Build and optimize benchmarking systems, datasets, experimentation pipelines, and production-grade ML evaluation tooling.
  • Collaborate closely with engineering, product, and design teams to integrate research findings into practical AI-powered applications and product experiences.
  • Lead initiatives focused on improving model quality, alignment, and performance across AI systems and developer tools.
  • Drive the onboarding and creation of challenging benchmarks for coding agents and advanced AI workflows.
  • Mentor researchers and engineers, promoting high technical standards, innovation, and effective execution practices.
  • Provide strategic guidance in ambiguous problem spaces and contribute to long-term AI quality and evaluation strategies.