Software Engineer, Benchmarking

This role is fully remote, and we are able to hire in many countries.
Full-Time
Middle
Salary: 125,000 - 200,000 USD per year

Job Details

Experience
More than two years
Required Skills
Cybersecurity, Software Engineering

Requirements

  • More than two years of professional experience building and maintaining complex systems.
  • Solid software engineering background.
  • Ability to regularly contribute high-quality, robust, and maintainable code.
  • Comfortable diving deep into existing codebases and infrastructure.
  • Ability to generate original ideas for new benchmarks, experiments, and projects.
  • Mission-driven mindset motivated by providing rigorous, independent insight into AI trends.
  • Ability to learn quickly.
  • Hands-on experience running LLM evaluations (a plus, not required).
  • Familiarity with evaluation frameworks like Inspect (a plus, not required).
  • Solid grasp of current AI trends (a plus, not required).
  • Cybersecurity experience (a plus, not required).

Responsibilities

  • Implement AI benchmarks within our evaluation infrastructure (primarily using the Inspect library) to expand the suite of capabilities we track.
  • Develop our existing suite of benchmarks so we can quickly and painlessly evaluate new model releases.
  • Contribute to the development of brand-new benchmarks.
  • Pitch and prototype your own ideas for new benchmarks and experiments.
  • Work closely with researchers, analysts, and other engineers to ensure evaluation data and outputs are accurate, insightful, and effectively integrated into research products.