AI Benchmark Engineer - Native Language Specialist - Marathi

LILT (Production)

AI, Language Technology

India (Remote)

Contract

Middle

Salary not disclosed

Job Details

5+ years of industry experience in software engineering.
Proven track record at leading technology companies and/or graduation from top-tier engineering universities.
Native or near-native fluency in Marathi, with deep understanding of its grammar, register, and phrasing rules.
High English proficiency.
Strong proficiency in Python.
Strong proficiency in standard shell scripting.
Strong proficiency in data processing.
Extensive experience with Terminal/CLI-based development workflows.
Working familiarity with coding agents.
Deep technical understanding of multilingual text processing pitfalls.
Experience with encoding/decoding robustness and Unicode normalization.
Knowledge of locale-dependent conventions (collation, casing, non-Gregorian dates).
Experience with text I/O, toolchain interoperability, and safe string operations.
Experience with bidirectional/RTL handling, font fallbacks, and rendering/typography in UI or artifacts (for specific languages).

Design, build, and validate evaluation suites of Terminal-Bench tasks.
Measure multilingual robustness across prompt language effects, non-English data processing, and complex locale/encoding edge cases.
Create high-signal, high-quality tasks that test a model's ability to handle multilingual environments.
Evaluate coding agents through task engineering.
Build realistic task environments using datasets and files in the native language (Marathi).
Identify AI failure points through prompting and translation in the native language.
Support the development of robust solutions (reference implementations) and write reliable, deterministic verifier scripts.
Analyze execution logs and calibrate task difficulty (Easy to Very Hard) using standard Terminal-Bench run configurations.
Participate in a rigorous, 4-layer human quality control process (creation, human review, calibration review, and audit) alongside automated LLM-based checks.