- Architect, implement, and optimize end-to-end AI inference services and agentic pipelines in Python.
- Design autonomous agents that can interpret, reason about, and act on video and multi-modal content.
- Integrate Vision Language Models (e.g., GPT-4o, Gemini Pro Vision, LLaVA) into robust, production-grade workflows.
- Leverage LLM/agent orchestration frameworks (e.g., LangGraph, AutoGen, Semantic Kernel, or similar) to coordinate complex visual AI tasks.
- Deploy and operate services on Kubernetes (and potentially OpenShift or NVIDIA Holoscan), ensuring reliability and scalability under heavy media workloads.
- Architect distributed systems on AWS, making informed trade-offs across performance, cost, and resilience.
- Optimize workloads for modern NVIDIA GPU architectures (Ampere, Hopper, Blackwell), focusing on real-time and high-throughput media use cases.
- Collaborate directly with clients in MEGS, including participating in pre-sales discussions to validate feasibility, shape solutions, and clarify the “why” behind requirements.
- Create clear architecture diagrams and technical documentation that align both technical and non-technical stakeholders.
- Provide technical leadership to project teams, guiding implementation to stay true to the intended architecture and product value.
- Work with video tooling such as FFmpeg, GStreamer, NVENC/NVDEC, and modern codecs (H.264/H.265), and explore emerging tools such as Mojo or NVIDIA Holoscan for Media.
- Design and deploy AI solutions to edge devices and on-premise or hybrid clusters.