AI Benchmark Engineer | Native Language Specialist - Serbian
LILT (Production) | AI, Language Technology
Serbia (Remote) | Contract | Senior
Salary not disclosed
Job Details
- Languages: Native or near-native fluency in Serbian; high English proficiency.
- Experience: 5+ years of industry experience in software engineering.
- Required Skills: Python
Requirements
- 5+ years of industry experience in software engineering.
- Proven track record at leading technology companies and/or graduation from top-tier engineering universities.
- Native or near-native fluency in Serbian, with a deep understanding of its grammar, register, and phrasing rules. High English proficiency.
- Strong proficiency in Python, standard shell scripting, and data processing.
- Extensive experience with Terminal/CLI-based development workflows and a working familiarity with coding agents.
- Deep technical understanding of multilingual text processing pitfalls, including encoding/decoding robustness and Unicode normalization; locale-dependent conventions (collation, casing, non-Gregorian dates); and text I/O, toolchain interoperability, and safe string operations.
Responsibilities
- Task Engineering: Evaluate coding agents.
- Asset Creation: Build realistic task environments using datasets and files in your native language.
- Prompting & Translation: Find the failure points where AI breaks down when working in your native language.
- Implementation & Verification: Support the development of robust solutions (reference implementations) and write highly reliable, deterministic verifier scripts (using rubric-based judging only when strictly necessary).
- Calibration & Execution: Analyze execution logs and calibrate task difficulty (Easy to Very Hard) using standard Terminal-Bench run configurations against various model tiers (Haiku, Sonnet, Opus).
- Quality Assurance: Participate in a rigorous, 4-layer human quality control process (creation, human review, calibration review, and audit) alongside automated LLM-based checks to ensure fairness, grammatical accuracy, and benchmark integrity.