AI Evaluation Engineer - Mathematics & Algorithms
Gramian Consulting Group
IT Professional Services
Locations: Bangladesh, Brazil, Colombia, Egypt, Ghana, India, Indonesia, Kenya, Nigeria, Pakistan, Turkey, Vietnam
Time zone: 4 hours overlap with PST
Employment type: Contract
Seniority: Middle
Salary not disclosed
Job Details
- Experience: 5+ years
- Required Skills: Docker, Python, NumPy
Requirements
- 5+ years in mathematics, quantitative research, or computational science
- Competition math, university-level mathematics, or quantitative research background
- Python programming
- NumPy, SciPy, or symbolic computation (SymPy)
- Experience writing mathematical proofs or formal derivations
- Ability to create problems with precise, verifiable answers
- Experience with AI coding benchmarks (SWE-bench, Terminal-bench)
- Comfortable with Docker: writing Dockerfiles, building images, and debugging container issues
- Understanding of numerical methods: floating-point tolerance, convergence criteria, error bounds
Responsibilities
- Design and build multi-agent benchmark tasks requiring multi-step mathematical reasoning and algorithmic problem-solving
- Create complex, decomposable problems across domains such as competition mathematics, numerical analysis, combinatorial optimization, and statistical inference
- Develop verification scripts to validate numerical outputs (with tolerance thresholds), proof correctness and logical steps, and algorithmic outputs and constraints
- Write clear, structured problem statements with precise notation and defined outputs
- Design task decomposition strategies for parallel or multi-agent execution
- Implement computational solutions and validation pipelines using Python
- Work with containerized environments (Docker) for reproducibility and evaluation
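
As an illustration of the kind of verification script described above, here is a minimal sketch of a tolerance-based check for numerical outputs. The function name `verify_numeric` and the default tolerances are hypothetical choices for this example, not part of the employer's actual tooling; it relies only on `math.isclose` from the Python standard library.

```python
import math


def verify_numeric(submitted, expected, rel_tol=1e-9, abs_tol=1e-12):
    """Return True if every submitted value matches its expected value
    within the given relative/absolute tolerance (hypothetical helper)."""
    # A wrong number of outputs is a failure, not a partial match.
    if len(submitted) != len(expected):
        return False
    # math.isclose returns False for NaN, so NaN outputs are rejected.
    return all(
        math.isclose(s, e, rel_tol=rel_tol, abs_tol=abs_tol)
        for s, e in zip(submitted, expected)
    )


# Exact equality fails on ordinary floating-point rounding...
print(0.1 + 0.2 == 0.3)                     # False
# ...while a tolerance-based check accepts the answer.
print(verify_numeric([0.1 + 0.2], [0.3]))   # True
```

The absolute tolerance matters when the expected value is zero or very close to it, where a purely relative tolerance would reject any nonzero rounding error.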