Forward Deployed Engineer (Generative AI)
Tiger Analytics Inc. (Advanced Analytics Consulting)
United States, Canada | Full-Time | Mid-Level
Salary not disclosed
Job Details
- Required Skills: Python, Cloud Computing, Kubernetes, PyTorch, Go, Terraform, Generative AI
Requirements
- Hands-on experience with LLM orchestration tools (LangChain, LlamaIndex, AutoGen) and deep learning frameworks (PyTorch, Hugging Face).
- Production experience setting up and querying vector stores (Milvus, Pinecone, Qdrant, Chroma, or pgvector).
- Proficiency in model serving frameworks (vLLM, TGI, Triton Inference Server) and evaluation tools.
- Advanced knowledge of cloud AI primitives (AWS Bedrock/SageMaker, Azure OpenAI, GCP Vertex AI) and Kubernetes (K8s) for GPU workloads.
- Mastery of Terraform or OpenTofu to provision complex multi-cloud compute environments.
- Strong coding skills in Python (preferred) or Go, with an emphasis on writing clean, concurrent code.
- Ability to manage customer expectations around LLM non-determinism, hallucinations, and performance trade-offs.
- Willingness to travel to client sites to lead high-stakes, on-site deployment sprints.
Responsibilities
- Deploy, fine-tune, and optimize large-scale generative AI models and LLM orchestration frameworks within customer cloud environments.
- Architect scalable infrastructure for AI workloads using GPU/TPU orchestration, high-performance storage, and low-latency networking.
- Design and implement high-throughput data ingestion pipelines and vector database architectures for Retrieval-Augmented Generation (RAG).
- Build cloud-agnostic, resilient deployments across AWS, Azure, and GCP using Infrastructure as Code (IaC).
- Act as the primary technical consultant, guiding enterprise clients through AI safety, prompt engineering patterns, and inference cost optimization.
- Feed edge-case deployment insights back to core AI research and platform engineering teams to improve product robustness.