Senior AI QA Engineer
Opportunity to work remotely within Poland, 13:00 to 21:00 Polish time
Full-Time
Senior
Salary not disclosed
Job Details
- Languages: English (B2)
- Experience: 5+ years of experience in software QA, with at least 1 year focused on testing AI agents
- Required Skills: AWS, Python, Jira, API testing, Prompt Engineering
Requirements
- 5+ years of experience in software QA, with at least 1 year focused on testing AI agents, agentic solutions or LLM-based systems
- Hands-on experience with both manual and automated testing of AI agents, including prompt/instruction testing and evaluation of agentic workflows
- Strong Python programming skills for test automation (pytest or equivalent)
- Expertise in AI agent frameworks, prompt engineering and evaluation metrics for LLM-based systems
- Demonstrated experience testing and evaluating Gen AI / LLM applications
- Applied knowledge of Gen AI / LLM evaluation frameworks and metrics — precision, recall, criteria recall and efficiency
- Familiarity with issue and test management tools such as Jira, QMetry and TestRail
- Experience with version control systems and integrating tests into CI/CD pipelines
- Understanding of cloud environments, particularly AWS
- Excellent communication, collaboration and leadership skills
Responsibilities
- Research and evolve automation frameworks in line with Gen AI tooling and best practices
- Design and automate evaluation of Gen AI features — grounding, answer accuracy, determinism/reproducibility, precision, recall, and criteria recall
- Build automated LLM test harnesses that scale evaluation beyond human-in-the-loop
- Select and apply Gen AI evaluation frameworks to measure answer quality and pipeline efficiency
- Perform manual testing as needed to validate new features, integrations, and user stories
- Build and maintain test cases from requirements and user stories
- Test applications that may include AI agents, APIs, databases, and other integrations
- Collaborate with product, engineering, and operations teams to understand requirements and deployment environments
- Track and report test results, defects, and quality metrics