Senior Sales Engineer - AI Inference Platform
Nebius
Cloud Computing, AI
Europe, Full-Time, Senior
Salary not disclosed
Job Details
- Required Skills: AWS, Docker, Python, Flask, GCP, Git, Kubernetes, Azure, FastAPI, LLM, LangChain
Requirements
- Deep understanding of AI inference systems and GPU-backed infrastructure
- Experience with LLM workloads and performance-sensitive environments
- Experience with inference frameworks and libraries (e.g., vLLM, SGLang, TensorRT-LLM)
- Ability to reason about latency, throughput, cost, and architecture tradeoffs
- Strong customer presence with engineering-first organizations
- Comfort challenging assumptions and pushing back constructively
- Commercial awareness – you understand that engineering time is a strategic resource
- Programming Languages: Python
- Frameworks and Libraries: vLLM, SGLang, TensorRT-LLM, OpenAI/Anthropic SDKs
- Frameworks for Agentic Pipelines: LangChain / LangSmith / smolagents / equivalent
- API and Web Frameworks: FastAPI, Flask
- MLOps and DevOps Tools: Kubernetes (K8s), Docker, Git
- Cloud Platforms: AWS (SageMaker, Bedrock), GCP (Vertex AI), Azure (Azure ML)
Responsibilities
- Lead deep technical discovery with engineering teams and technical founders
- Understand model requirements, traffic expectations, latency constraints, GPU economics, and system dependencies
- Translate customer ambition into production-feasible architectures
- Identify hidden technical risks early
- Partner tightly with Sales on strategic deals
- Influence deal strategy through architectural clarity
- Prevent misaligned commitments before engineering allocation
- Increase PoC-to-production conversion by ensuring technical realism
- Define measurable success criteria (latency, TTFT, throughput, cost envelope)
- Classify workload complexity and required optimization depth
- Align appropriate resources (ML Solution Architects, engineering, GPU capacity, etc.)
- Drive structured Go / No-Go decisions
- Prevent uncontrolled customization or hidden R&D
- Identify recurring configuration patterns across customers
- Quantify demand for advanced optimizations (quantization, speculative decoding, etc.)
- Surface structured insights to Product and Engineering
- Help evolve platform capabilities based on real workload data