Principal QA Engineer (AI Systems & Platform)
Peach Pilot (Insurance)
Fully Remote, US-Based · Full-Time · Principal
Salary: 140,000–180,000 USD per year
Job Details
- Experience
  - 7+ years of QA engineering experience, with at least 3 years in a senior or lead capacity
- Required Skills
  - Docker, PostgreSQL, Python, Artificial Intelligence, GCP, TypeScript, API testing, Azure, FastAPI, Redis, Next.js, React, CI/CD, GitHub Actions, LLM, Playwright
Requirements
- 7+ years of QA engineering experience, with at least 3 years in a senior or lead capacity where you shaped process and standards.
- Experience testing AI/LLM-powered applications: you understand prompt sensitivity and output variance, and you know how to build eval pipelines that catch regressions across model updates.
- Strong test-code skills, with Python as a primary tool; you have built and maintained CI/CD-integrated test suites.
- Hands-on experience with Playwright and Vitest in a production environment, including building automation frameworks from scratch.
- Comfort testing complex API chains, async/streaming responses, and multi-service workflows; data pipelines and knowledge graph outputs don't intimidate you.
- Experience building a QA function from the ground up in an early-stage environment, knowing when to move fast and when to go deep.
- An instinct for testing for confusion and trust failure, not just broken functionality, and for advocating on behalf of non-technical executives.
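The eval-pipeline requirement above can be sketched in miniature as a prompt regression check: run a set of golden prompts through the model and score outputs against expected content. This is an illustrative sketch only; `call_model` is a stub standing in for a real LLM client, and the golden cases, keyword scoring, and pass threshold are all hypothetical, not part of the role description.

```python
# Minimal prompt-regression sketch: score model outputs for golden prompts
# against required keywords, flagging any case that drops below a threshold.
# `call_model`, the cases, and the threshold are hypothetical stand-ins.

GOLDEN_CASES = [
    {"prompt": "Summarize the claim status.", "must_include": ["claim", "status"]},
    {"prompt": "List open policy exclusions.", "must_include": ["exclusions"]},
]

def call_model(prompt: str) -> str:
    # Stubbed response; in practice this would call the LLM provider,
    # and re-running the suite after a model update surfaces regressions.
    return f"Here is the {prompt.lower().rstrip('.')} you asked about."

def keyword_score(output: str, must_include: list[str]) -> float:
    # Fraction of required keywords present (case-insensitive).
    hits = sum(1 for kw in must_include if kw.lower() in output.lower())
    return hits / len(must_include)

def run_eval(threshold: float = 0.8) -> list[dict]:
    """Return per-case results; a case regresses when its score falls below threshold."""
    results = []
    for case in GOLDEN_CASES:
        output = call_model(case["prompt"])
        score = keyword_score(output, case["must_include"])
        results.append({"prompt": case["prompt"], "score": score, "passed": score >= threshold})
    return results
```

In practice a suite like this would run in CI on every prompt or model change, with scoring swapped for whatever quality metric the product actually needs (exact-match, rubric grading, or an LLM judge).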
Responsibilities
- Build the QA Foundation: Establish the testing framework (unit, integration, end-to-end, AI-specific evaluation pipelines) using Playwright and Vitest.
- Define quality standards, test coverage requirements, and documentation practices in partnership with the Lead Engineer.
- Audit the existing platform and identify the highest-risk surfaces before the next client deployment.
- Define the team structure you will need — onshore vs. offshore mix, roles, and a hiring roadmap — and begin executing against it.
- Design evaluation frameworks for non-deterministic LLM outputs — including prompt regression testing, model drift detection, and output quality scoring.
- Build automated test suites for the agent orchestration layer, including governance agent audit trail integrity and human-override behavior.
- Validate the Company Brain (Memgraph + Qdrant) for data accuracy, retrieval quality, and failure modes under real enterprise data conditions.
- Own end-to-end testing of the data ingestion pipelines that connect to client systems through Nango's 700+ connector integration layer.
- Test multi-model routing logic to confirm cost-optimized task allocation behaves correctly across LLM providers via LiteLLM.
- Recruit, hire, and onboard QA engineers as the team grows, setting clear expectations and working standards.
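The multi-model routing responsibility above implies tests around one core behavior: the router should pick the cheapest model whose capability covers the task. A toy sketch of that rule follows; the model names, cost table, and complexity heuristic are hypothetical, and the posting's actual routing runs through LiteLLM rather than code like this.

```python
# Rough sketch of cost-optimized model routing: choose the cheapest model
# whose capability tier meets the task's estimated complexity. Model names,
# costs, and the heuristic are hypothetical, for illustration only.

MODELS = [
    # (name, capability tier, relative cost)
    ("small-fast", 1, 0.1),
    ("mid-general", 2, 0.5),
    ("large-reasoning", 3, 2.0),
]

def task_complexity(prompt: str) -> int:
    # Toy heuristic: longer prompts and reasoning cues imply harder tasks.
    tier = 1
    if len(prompt) > 200 or any(w in prompt.lower() for w in ("why", "plan", "analyze")):
        tier = 2
    if "multi-step" in prompt.lower():
        tier = 3
    return tier

def route(prompt: str) -> str:
    """Return the cheapest model whose tier meets the task's complexity."""
    needed = task_complexity(prompt)
    eligible = [m for m in MODELS if m[1] >= needed]
    return min(eligible, key=lambda m: m[2])[0]
```

A QA suite for this layer would assert exactly this kind of invariant: for each task class, the selected provider is both capable enough and the cheapest eligible option.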