AI Benchmark Engineer - Native Language Specialist
India | Contract | Middle
Salary not disclosed
Job Details
- Languages
- Native or near-native fluency in a language other than English; high English proficiency
- Experience
- 5+ years
- Required Skills
- Python
Requirements
- 5+ years of professional experience in software engineering or a related technical field.
- Background working at leading technology companies, or a strong academic foundation from a top-tier engineering institution.
- Native or near-native fluency in a language other than English, with deep understanding of grammar, structure, and contextual usage.
- Strong proficiency in Python, shell scripting, and data processing workflows.
- Hands-on experience with CLI/terminal-based development environments and familiarity with coding agents or AI-assisted tools.
- Strong understanding of multilingual computing challenges, including Unicode handling, encoding/decoding, and locale-specific behaviors.
- Knowledge of text-processing edge cases such as bidirectional scripts, collation rules, non-Gregorian calendar and date formats, and rendering constraints.
- High English proficiency for collaboration, documentation, and technical communication.
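To illustrate the kind of Unicode handling issue the requirements above refer to, here is a minimal Python sketch. It is an illustration only, not part of the employer's materials; the helper name `nfc` and the sample strings are assumptions chosen for the example. It shows that two strings that render identically can differ in code points and bytes until they are normalized.

```python
import unicodedata


def nfc(text: str) -> str:
    """Return the NFC (canonical composed) form of text. Hypothetical helper."""
    return unicodedata.normalize("NFC", text)


# Same visible word, two different encodings of the accented letter.
composed = "caf\u00e9"     # 'é' as a single precomposed code point
decomposed = "cafe\u0301"  # 'e' followed by a combining acute accent

# Raw comparison fails even though both strings render the same way.
raw_equal = composed == decomposed

# Comparison after normalization succeeds.
normalized_equal = nfc(composed) == nfc(decomposed)

# Lengths differ in code points and in UTF-8 bytes before normalization.
code_points = (len(composed), len(decomposed))
utf8_bytes = (len(composed.encode("utf-8")), len(decomposed.encode("utf-8")))

print(raw_equal, normalized_equal, code_points, utf8_bytes)
```

A task that compares filenames, dictionary keys, or search terms without normalizing first can fail on input like this, which is one reason such cases are useful in multilingual benchmarks.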
Responsibilities
- Design and engineer high-quality Terminal-Bench tasks to evaluate multilingual performance of AI coding agents in realistic environments.
- Create and maintain multilingual datasets and file-based assets in your native language, ensuring linguistic integrity without translation simplification.
- Identify AI failure points in non-English prompts and workflows, and design challenges that rigorously test robustness.
- Develop reference implementations and deterministic verifier scripts to ensure reliable and reproducible evaluation outputs.
- Calibrate task difficulty levels (Easy to Very Hard) based on model performance analysis and execution logs across different AI model tiers.
- Participate in structured multi-layer quality assurance processes, including creation review, calibration validation, and audit checks.
- Ensure benchmark fairness, grammatical accuracy, and technical integrity through both manual review and automated validation systems.
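As a rough sketch of what a deterministic verifier script might look like, the Python below compares an agent's output with a reference answer after normalizing line endings and Unicode form. It is an assumption-laden illustration: the function names `fingerprint` and `verify` are hypothetical, and this is not the Terminal-Bench harness API or the employer's actual verification method.

```python
import hashlib
import unicodedata


def fingerprint(text: str) -> str:
    """Hash text after normalizing newlines and Unicode form (hypothetical helper)."""
    # Treat Windows and Unix line endings as equivalent.
    unified = text.replace("\r\n", "\n")
    # Ignore trailing newlines so a missing final newline does not fail the check.
    unified = unified.rstrip("\n")
    # Use NFC so composed and decomposed characters hash identically.
    normalized = unicodedata.normalize("NFC", unified)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def verify(actual: str, expected: str) -> bool:
    """Return True when actual matches expected under the normalization above."""
    return fingerprint(actual) == fingerprint(expected)


# Same content: different line endings and a decomposed 'ü' in the second string.
same_content = verify("Z\u00fcrich\r\n", "Zu\u0308rich\n")

# Different content: the umlaut is missing entirely.
different_content = verify("Zurich\n", "Z\u00fcrich\n")

print(same_content, different_content)
```

Because the check depends only on the input strings, repeated runs give the same verdict. Whether trailing newlines or normalization form should be forgiven is a per-task design decision; a task that specifically tests encoding fidelity would likely compare raw bytes instead.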