Lead end-to-end experimentation: hypothesis generation, metric design, experiment design, analysis, and interpretation.
Build and maintain evaluation frameworks for LLMs.
Develop predictive models, classification and ranking systems, and heuristics.
Collaborate with prompt engineers and model builders to test strategies and analyze failure modes.
Automate experiment pipelines: dashboards, monitoring, alerting, and instrumentation.
Use causal inference and observational studies when randomized experiments are not feasible.
Present findings and recommendations to leadership; influence roadmap decisions.
Drive experimentation in startup-like environments.
Shape large-scale product experimentation.
Lead and mentor teams of data scientists, analysts, and engineers.