Product QA Engineer - Gen AI
Inactive
Work from anywhere · Full-Time · Senior
Salary: 18,000–24,000 USD per year
Job Details
- Languages: English
- Required Skills: Leadership, Python, Software Development, Agile, Artificial Intelligence, Cloud Computing, Machine Learning, Product Management, QA, QA Automation, User Experience Design, Product Development, API testing, Manual testing, Regression testing, TestRail, CI/CD, Agile methodologies, DevOps, Microservices, Software Engineering
Requirements
- Experience testing GenAI or LLM-driven products, including common failure modes such as hallucinations, unsafe responses, bias, and brittle decision paths.
- Exposure to performance and load testing tools and practices for web applications and APIs.
- Familiarity with structured exploratory testing approaches and test charters, especially for AI behavior and agent decision-making.
- Prior experience in high-velocity environments (e.g., startups) where QA acts as an owner of quality rather than a purely executional function.
- A preference for automation over repetition, balanced by an appreciation for focused exploratory testing.
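The load-testing exposure listed above can be sketched in miniature. This is an illustrative harness only, not part of the role's actual stack: the `call_api` stub stands in for a real HTTP request so the example runs without a network, and the request counts are arbitrary.

```python
# Minimal sketch of a concurrent load probe for a web API.
# `call_api` is a hypothetical stand-in for a real HTTP request.
import time
from concurrent.futures import ThreadPoolExecutor

def call_api(_: int) -> float:
    """Simulate one request; return its latency in seconds."""
    start = time.perf_counter()
    time.sleep(0.01)  # stand-in for ~10 ms of server work
    return time.perf_counter() - start

def run_load(requests: int = 50, concurrency: int = 10) -> dict:
    """Fire `requests` calls with `concurrency` workers; report percentiles."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(call_api, range(requests)))
    return {
        "count": len(latencies),
        "p50": latencies[len(latencies) // 2],
        "p95": latencies[int(len(latencies) * 0.95)],
    }

if __name__ == "__main__":
    stats = run_load()
    print(f"p50={stats['p50'] * 1000:.1f} ms  p95={stats['p95'] * 1000:.1f} ms")
```

In practice a dedicated tool (e.g. Locust or k6) would replace this hand-rolled loop, but the shape (concurrent workers, sorted latencies, percentile reporting) is the same.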
Responsibilities
- Own the full QA lifecycle for Agentic AI products: strategy, design, execution, reporting, and release sign-off.
- Design and run test plans covering functional, regression, smoke, exploratory, and usability testing for AI behavior and decision chains.
- Validate multi-step decision flows and reasoning to catch logic gaps, guardrail failures, or requirement mismatches.
- Perform structured exploratory testing to uncover unexpected behaviors, edge cases, and cascading AI failures.
- Build synthetic test scripts for UI elements, APIs, and end-to-end flows to verify functionality.
- Test across platforms (web, mobile, integrations) for consistency and performance.
- Maintain dashboards tracking test coverage, failures, and quality KPIs for all stakeholders.
- Improve test reliability: fix flakiness, optimize parallel runs, and cut execution time.
- Partner with Product, Design, and Engineering to refine requirements and set clear go/no-go criteria.
- Monitor pre- and post-release quality; use data to enhance AI evaluation and guardrails.
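The guardrail-validation and scripted-test responsibilities above can be sketched as a small automated check. Everything here is illustrative: `generate_reply` is a hypothetical stub for the product's model call, and the banned-phrase list is a placeholder for real safety policy.

```python
# Minimal sketch of an automated guardrail regression check for an
# LLM-backed product. `generate_reply` is a hypothetical stub standing
# in for the real model API; the phrase list is illustrative only.

BANNED_PHRASES = ["i am a human", "your account has been deleted"]

def generate_reply(prompt: str) -> str:
    # Stub; a real suite would call the product's API here.
    return "I'm an AI assistant, and I can help you check your order status."

def violates_guardrails(reply: str) -> bool:
    """Flag replies containing any banned phrase (case-insensitive)."""
    lower = reply.lower()
    return any(phrase in lower for phrase in BANNED_PHRASES)

def test_reply_passes_guardrails():
    reply = generate_reply("Where is my order?")
    assert reply, "model returned an empty reply"
    assert not violates_guardrails(reply), f"guardrail violation: {reply!r}"

if __name__ == "__main__":
    test_reply_passes_guardrails()
    print("guardrail check passed")
```

A real suite would run checks like this in CI against many prompts per release, feeding failures back into the quality dashboards and go/no-go criteria described above.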