Design, ship, and maintain Python/FastAPI services for LLM workflows and 3D context retrieval. Optimize latency and throughput across async pipelines, and improve GPU utilization. Enforce auth-first design for APIs and WebSockets. Manage GCP operations, including Cloud Run, Cloud Functions, Pub/Sub, Postgres, Redis, and CI/CD. Establish observability through structured logging, tracing, SLOs with alerting, and dashboards. Partner with 3D/ML engineers to productize models via stable APIs.