- Design, build, and maintain scalable ML, AI, and data-platform services that power production features end-to-end.
- Operate LLM applications in production: chat with memory, retrieval, prompt management, versioning, and experimentation.
- Build and run structured evaluation pipelines (golden datasets, regression checks) so changes ship with confidence.
- Own AI/ML observability and monitoring: instrument, trace, and debug model behavior in production.
- Integrate and serve model backends behind an LLM-agnostic interface, and support deterministic and custom ML models alongside them.
- Contribute to data lineage, governance, and compliance as the platform matures.
- Partner closely with product and the broader team to move fast without sacrificing quality.
AWS, Python, GCP