AI Benchmark & Datasets Engineer/Researcher Internship

EU, UK, US, or Canada
Internship
Salary not disclosed

Job Details

Languages
English

Requirements

  • Experience with ML/LLM evaluation, data science, or technical product roles, ideally around benchmarks or experimentation.
  • Comfortable reading papers, leaderboards, and GitHub repos, and turning them into clear, repeatable benchmark specs.
  • Ability to talk comfortably with both engineers and customers, and translate between technical detail and business value.
  • Care about high‑quality data, reproducible experiments, and crisp documentation.
  • Respectful of others.
  • Fluent in English.
  • ICPC World Finalist, or an IOI, IMO, IOAI, or IPhO medalist in high school.
  • Published a research paper at an A-rated or A*-rated venue (according to ICORE).
  • Completed coding projects - ideally with a GitHub repository showcasing previous work.
  • Interned at a leading machine learning research center (e.g., Google Brain / DeepMind, Apple, Meta, Anthropic, Nvidia, MILA).
  • Can get a warm recommendation from a university faculty member.

Responsibilities

  • Proactively identify, prioritize, and curate relevant public and client-driven benchmarks across target use cases and markets.
  • Evaluate candidate benchmarks for clarity, data quality, evaluation methodology, and fit with the model roadmap.
  • Run benchmarks with baseline models to validate setup, uncover edge cases, and de-risk R&D runs.
  • Hand off “benchmark-ready” packages to R&D (specs, data, evaluation scripts, expected metrics, constraints).
  • Maintain a shared vocabulary and documentation around benchmarks, datasets, and evaluation formats that GTM and R&D can both use.
  • Track and organize benchmark results, model leaderboards, and “what good looks like” for different customers and scenarios.
  • Contribute to demos and public‑facing proof points based on benchmark outcomes.
  • Play a key role in defining and driving the benchmarking process for AI model evaluation.