Staff Data Scientist - Product Experimentation & Evaluation - US

Posted 3 months ago

120000 - 260000 USD per year

United States | Full-Time | Data Science, Product Experimentation

Company: Checkmate

Location: United States

Languages: English

Seniority level: Staff, 8-12+ years

Experience: 8-12+ years

Skills:

Leadership, Python, SQL, Artificial Intelligence, Data Analysis, Machine Learning, Cross-functional Team Leadership, Strategy, Product Analytics, Data Science, Communication Skills, Mentoring, A/B Testing

Requirements:

8-12+ years of experience in data science / ML roles, ideally with experiment design and product analytics.
Proven track record in both startup-style and large-scale product experimentation.
Experience leading teams, setting strategy, and driving execution in cross-functional environments.
Strong background in statistical methods, causal inference, and rigorous measurement.
Experience with LLMs, NLP, AI, prompt engineering, or a closely related field.
Excellent coding skills in Python (or similar), strong SQL, and experience building and deploying models or analytic pipelines.
Ability to work in cross-functional teams and translate technical results into business or product changes.
Strong communication skills, including the ability to explain complex analyses to non-technical stakeholders.
Experience fine-tuning or working with multiple LLM providers / APIs (Preferred).
Experience with experiment platforms or building internal tooling for experimentation and model evaluation (Preferred).
Experience with voice / ASR or other multi-modal data (Preferred).

Responsibilities:

Lead end-to-end experimentation: hypothesis generation, metric design, experiment design, analysis, and interpretation.
Build and maintain evaluation frameworks for LLMs.
Develop predictive models, classification/ranking systems, and heuristics.
Collaborate with prompt engineers and model builders to test strategies and analyze failure modes.
Automate experiment pipelines: dashboards, monitoring, alerting, instrumentation.
Use causal inference / observational studies when randomized experiments are not feasible.
Present findings and recommendations to leadership; influence roadmap decisions.
Drive experimentation in startup-like environments.
Shape large-scale product experimentation.
Lead and mentor teams of data scientists, analysts, and engineers.