Strategic Technical Account Manager - GPU
Vultr, Cloud Infrastructure
Remote - United States, Full-Time, Mid-Level
Salary: 115,000 - 140,000 USD per year
Job Details
- Experience: 2–5+ years
- Required Skills: Kubernetes, PyTorch, TensorFlow
Requirements
- 2–5+ years as an AI/ML Engineer, AI/ML Ops Engineer, Technical Account Manager, HPC Engineer, Sales/Solutions Engineer, or in a related technical role
- Strong knowledge of GPU hardware architectures (NVIDIA/AMD)
- Strong knowledge of CUDA/ROCm
- Experience with distributed training and ML frameworks
- Experience with Linux tuning
- Experience with networking (InfiniBand, RoCE fabrics)
- Experience with high-performance storage systems (DDN, NetApp, Vast, Weka, etc.)
- Ability to communicate complex concepts clearly to both executives and engineering teams
- Advanced hands-on Kubernetes skills
- Advanced hands-on SLURM skills
Responsibilities
- Lead onboarding for customers deploying GPU clusters (bare metal, VMs, or hybrid)
- Advise on cluster design: multi-GPU topology, NVLink/NVSwitch considerations, RDMA, InfiniBand and RoCE Ethernet fabrics, network throughput, and storage IOPS requirements
- Guide customers in selecting GPU types and configurations based on workload (training, fine-tuning, inference, embeddings, RAG pipelines)
- Support distributed frameworks: PyTorch, TensorFlow, DeepSpeed, Megatron, JAX, Ray, Mosaic, Hugging Face, etc.
- Identify bottlenecks (network, storage, memory bandwidth) and provide tuning recommendations
- Own the long-term technical strategy across assigned GPU/AI accounts
- Partner with Support, SRE, Networking, NOC, and Product Management & Engineering to resolve high-urgency incidents
- Identify opportunities for expanded clusters, high-speed storage, or networking upgrades
- Provide structured feedback on existing and future GPU offerings, networking fabrics, storage platforms, and upcoming AI/ML platform features