AI Evaluation Engineer
Gramian Consulting Group
IT Consulting
Bangladesh, Brazil, Colombia, Egypt, Ghana, India, Pakistan, Indonesia, Kenya, Nigeria, Turkey, Vietnam (minimum 4h PST overlap)
Contract
Middle
Salary not disclosed
Job Details
- Experience: 3–10 years
- Required Skills: Cybersecurity, DevOps, Software Engineering, Debugging, MLOps
Requirements
- 3–10 years of experience in software engineering or related technical domains
- Strong debugging, analytical, and systems reasoning skills
- Good understanding of system architecture, dependencies, and operational processes
- Experience with terminal, CLI, automation, or developer tooling workflows
- Ability to design technically rigorous and realistic engineering scenarios
- Background in backend engineering, infrastructure, DevOps, data systems, MLOps, cybersecurity, or platform engineering
Responsibilities
- Design realistic terminal-based benchmark tasks for AI evaluation systems
- Create technically deep debugging and investigation scenarios
- Develop task specifications involving infrastructure, workflows, pipelines, or operational failures
- Write clear solution approaches and deterministic evaluation criteria
- Identify realistic edge cases, failure modes, and system constraints
- Design multi-step reasoning challenges across complex technical environments
- Contribute expertise across one or more engineering or operational domains
- Review and refine benchmark quality, difficulty, and validation logic
- Collaborate with reviewers and researchers on AI evaluation workflows