Principal QA Engineer (AI Systems & Platform)
Peach Pilot (Insurance)
Fully Remote, US-Based · Full-Time · Principal
Salary: 140,000–180,000 USD per year
Job Details
- Experience
  - 7+ years of QA engineering experience, with at least 3 years in a senior or lead capacity
- Required Skills
  - Docker, PostgreSQL, Python, Artificial Intelligence, GCP, TypeScript, API testing, Azure, FastAPI, Redis, Next.js, React, CI/CD, GitHub Actions, LLM, Playwright
Requirements
- 7+ years of QA engineering experience, with at least 3 years in a senior or lead capacity where you shaped process and standards.
- Experience testing AI/LLM-powered applications: you understand prompt sensitivity and output variance, and you know how to build eval pipelines that catch regressions across model updates.
- Strong test-code skills, with Python as a primary tool; you have built and maintained CI/CD-integrated test suites.
- Hands-on experience with Playwright and Vitest in a production environment, including building automation frameworks from scratch.
- Comfort testing complex API chains, async/streaming responses, and multi-service workflows; data pipelines and knowledge graph outputs don't intimidate you.
- Experience building a QA function from the ground up in an early-stage environment, knowing when to move fast and when to go deep.
- An instinct for testing for confusion and trust failure, not just broken functionality, and for advocating on behalf of non-technical executives.
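The eval-pipeline requirement above can be sketched in miniature as a prompt regression check: run a set of golden prompts through the model and score outputs against expected content. This is an illustrative sketch only; `call_model` is a stub standing in for a real LLM client, and the golden cases, keyword scoring, and pass threshold are all hypothetical, not part of the role description.

```python
# Minimal prompt-regression sketch: score model outputs for golden prompts
# against required keywords, flagging any case that drops below a threshold.
# `call_model`, the cases, and the threshold are hypothetical stand-ins.

GOLDEN_CASES = [
    {"prompt": "Summarize the claim status.", "must_include": ["claim", "status"]},
    {"prompt": "List open policy exclusions.", "must_include": ["exclusions"]},
]

def call_model(prompt: str) -> str:
    # Stubbed response; in practice this would call the LLM provider,
    # and re-running the suite after a model update surfaces regressions.
    return f"Here is the {prompt.lower().rstrip('.')} you asked about."

def keyword_score(output: str, must_include: list[str]) -> float:
    # Fraction of required keywords present (case-insensitive).
    hits = sum(1 for kw in must_include if kw.lower() in output.lower())
    return hits / len(must_include)

def run_eval(threshold: float = 0.8) -> list[dict]:
    """Return per-case results; a case regresses when its score falls below threshold."""
    results = []
    for case in GOLDEN_CASES:
        output = call_model(case["prompt"])
        score = keyword_score(output, case["must_include"])
        results.append({"prompt": case["prompt"], "score": score, "passed": score >= threshold})
    return results
```

In practice a suite like this would run in CI on every prompt or model change, with scoring swapped for whatever quality metric the product actually needs (exact-match, rubric grading, or an LLM judge).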
Responsibilities
- Build the QA Foundation: Establish the testing framework (unit, integration, end-to-end, AI-specific evaluation pipelines) using Playwright and Vitest.
- Define quality standards, test coverage requirements, and documentation practices in partnership with the Lead Engineer.
- Audit the existing platform and identify the highest-risk surfaces before the next client deployment.
- Define the team structure you will need — onshore vs. offshore mix, roles, and a hiring roadmap — and begin executing against it.
- Design evaluation frameworks for non-deterministic LLM outputs — including prompt regression testing, model drift detection, and output quality scoring.
- Build automated test suites for the agent orchestration layer, including governance agent audit trail integrity and human-override behavior.
- Validate the Company Brain (Memgraph + Qdrant) for data accuracy, retrieval quality, and failure modes under real enterprise data conditions.
- Own end-to-end testing of the data ingestion pipelines that connect to client systems through Nango's 700+ connector integration layer.
- Test multi-model routing logic to confirm cost-optimized task allocation behaves correctly across LLM providers via LiteLLM.
- Recruit, hire, and onboard QA engineers as the team grows, setting clear expectations and working standards.
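The multi-model routing responsibility above implies tests around one core behavior: the router should pick the cheapest model whose capability covers the task. A toy sketch of that rule follows; the model names, cost table, and complexity heuristic are hypothetical, and the posting's actual routing runs through LiteLLM rather than code like this.

```python
# Rough sketch of cost-optimized model routing: choose the cheapest model
# whose capability tier meets the task's estimated complexity. Model names,
# costs, and the heuristic are hypothetical, for illustration only.

MODELS = [
    # (name, capability tier, relative cost)
    ("small-fast", 1, 0.1),
    ("mid-general", 2, 0.5),
    ("large-reasoning", 3, 2.0),
]

def task_complexity(prompt: str) -> int:
    # Toy heuristic: longer prompts and reasoning cues imply harder tasks.
    tier = 1
    if len(prompt) > 200 or any(w in prompt.lower() for w in ("why", "plan", "analyze")):
        tier = 2
    if "multi-step" in prompt.lower():
        tier = 3
    return tier

def route(prompt: str) -> str:
    """Return the cheapest model whose tier meets the task's complexity."""
    needed = task_complexity(prompt)
    eligible = [m for m in MODELS if m[1] >= needed]
    return min(eligible, key=lambda m: m[2])[0]
```

A QA suite for this layer would assert exactly this kind of invariant: for each task class, the selected provider is both capable enough and the cheapest eligible option.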