10+ years in software engineering, 5+ years in management roles with large-scale AI/ML systems and infrastructure. Expert-level proficiency in Python and Golang. 5+ years building production distributed systems. Experience with orchestration frameworks (Kubernetes, Ray, Dask). Proficiency with vector databases (Pinecone, Weaviate, Qdrant, or similar). Experience with message queuing systems (Kafka, Pulsar, RabbitMQ). In-depth knowledge and hands on experience building scalable distributed architectures and high-performance compute systems. Proven experience in multimodal ingestion pipelines within RAG platforms. Direct experience in designing, fine-tuning, and optimizing LLMs for ingestion and retrieval workloads. Previous success managing engineering teams delivering production-grade, HPC-scale RAG systems. Deep understanding of infra domains: compute, storage, networking, observability, security, disaster recovery, and cost management. Familiarity with HPC cluster management softwares such as Slurm. Familiarity with cloud platforms (AWS, Azure, GCP) and/or on-prem datacenter operations.