Human Data Evals Lead

Remote / Latam / US | Full-Time | Lead
Salary not disclosed

Job Details

Languages: Fluent English
Experience: 5+ years

Requirements

  • 5+ years in technical delivery, quality, or program management.
  • Recent experience in AI/ML data, model evaluation, or benchmarking.
  • Hands-on experience delivering data or evaluation work to AI labs or enterprise ML teams.
  • Working fluency with frontier model evaluation (benchmarks, rubrics, pass rates, headroom).
  • Proven experience recruiting, calibrating, and leading teams or expert pools.
  • Experience translating eval targets into sample tasks that demonstrate capability.
  • Ability to build QC processes that meet enterprise or lab-grade standards.
  • Fluency in English.
  • Ability to thrive in fast-moving, ambiguous environments.

Responsibilities

  • Study public benchmarks and eval targets to create proposals and sample packages that demonstrate capability.
  • Design and build sample packages in collaboration with subject-matter experts, ensuring expert-verified ground truth.
  • Develop rigorous QC structures including calibration layers, rubrics, and deterministic verifiers.
  • Recruit, brief, and calibrate a pool of subject-matter experts across coding and STEM domains.
  • Manage lab relationships as a primary point of contact for project updates and requirements.
  • Own pilot delivery end-to-end, including scoping, SOW, staffing, production, and QC.