Software Engineer, Benchmarking
This role is fully remote, and we are able to hire in many countries.
Full-Time, Middle
Salary: 125,000 - 200,000 USD per year
Job Details
- Experience: More than two years
- Required Skills: Cybersecurity, Software Engineering
Requirements
- More than two years of professional experience building and maintaining complex systems.
- Solid software engineering background.
- Ability to regularly contribute high-quality, robust, and maintainable code.
- Comfortable diving deep into existing codebases and infrastructure.
- Ability to generate original ideas for new benchmarks, experiments, and projects.
- Mission-driven mindset motivated by providing rigorous, independent insight into AI trends.
- Ability to learn quickly.
- Hands-on experience running LLM evaluations (plus, not required).
- Familiarity with evaluation frameworks like Inspect (plus, not required).
- Solid grasp of current AI trends (plus, not required).
- Cybersecurity experience (plus, not required).
Responsibilities
- Implement AI benchmarks within our evaluation infrastructure (primarily using the Inspect library) to expand the suite of capabilities we track.
- Develop our existing suite of benchmarks so we can quickly and painlessly evaluate new model releases.
- Contribute to the development of brand new benchmarks.
- Pitch and prototype your own ideas for new benchmarks and experiments.
- Work closely with researchers, analysts, and other engineers to ensure evaluation data and outputs are accurate, insightful, and effectively integrated into research products.