AI Benchmark Engineer - Native Language Specialist - German
LILT (Production)
AI, Language Technology
Germany (Remote), Contract, Middle
Salary not disclosed
Job Details
- Languages: German, English
- Experience: 1+ years
- Required Skills: Python
Requirements
- 1+ years of industry experience in software or prompt engineering.
- Proven track record at leading technology companies and/or graduation from top-tier engineering universities.
- Native or near-native fluency in German, with a deep understanding of its grammar, register, and phrasing rules.
- High English proficiency.
- Strong proficiency in Python.
- Strong proficiency in standard shell scripting.
- Strong proficiency in data processing.
- Extensive experience with Terminal/CLI-based development workflows.
- Working familiarity with coding agents.
- Deep technical understanding of multilingual text processing pitfalls (encoding/decoding robustness, Unicode normalization, locale-dependent conventions, text I/O, toolchain interoperability, safe string operations).
Responsibilities
- Design, build, and validate benchmarks to test large language models on multilingual software challenges.
- Create high-signal, high-quality tasks that genuinely test a model's ability to handle multilingual environments.
- Build realistic task environments using datasets and files in your native language (German).
- Identify failure points where AI models break down when working in your native language.
- Support the development of robust reference implementations (solutions).
- Write highly reliable, deterministic verifier scripts.
- Analyze execution logs and calibrate task difficulty (Easy to Very Hard) using standard Terminal-Bench run configurations.
- Participate in a rigorous, 4-layer human quality control process (creation, human review, calibration review, and audit).
- Ensure fairness, grammatical accuracy, and benchmark integrity with automated LLM-based checks.