Senior Sales Engineer - AI Inference Platform

Nebius
Cloud Computing, AI
Europe, Full-Time, Senior
Salary not disclosed

Job Details

Required Skills
AWS, Docker, Python, Flask, GCP, Git, Kubernetes, Azure, FastAPI, LLM, LangChain

Requirements

  • Deep understanding of AI inference systems and GPU-backed infrastructure
  • Experience with LLM workloads and performance-sensitive environments
  • Experience with inference frameworks and libraries (e.g., vLLM, SGLang, TensorRT-LLM)
  • Ability to reason about latency, throughput, cost, and architecture tradeoffs
  • Strong customer presence with engineering-first organizations
  • Comfort challenging assumptions and pushing back constructively
  • Commercial awareness – you understand that engineering time is a strategic resource
  • Programming languages: Python
  • Frameworks and libraries: vLLM, SGLang, TensorRT-LLM, OpenAI/Anthropic SDKs
  • Frameworks for agentic pipelines: LangChain, LangSmith, smolagents, or equivalent
  • API and web frameworks: FastAPI, Flask
  • MLOps and DevOps tools: Kubernetes (K8s), Docker, Git
  • Cloud platforms: AWS (SageMaker, Bedrock), GCP (Vertex AI), Azure (Azure ML)

Responsibilities

  • Lead deep technical discovery with engineering teams and technical founders
  • Understand model requirements, traffic expectations, latency constraints, GPU economics, and system dependencies
  • Translate customer ambition into production-feasible architectures
  • Identify hidden technical risks early
  • Partner tightly with Sales on strategic deals
  • Influence deal strategy through architectural clarity
  • Prevent misaligned commitments before engineering allocation
  • Increase PoC-to-production conversion by ensuring technical realism
  • Define measurable success criteria (latency, TTFT, throughput, cost envelope)
  • Classify workload complexity and required optimization depth
  • Align appropriate resources (ML Solution Architects, engineering, GPU capacity, etc.)
  • Drive structured Go / No-Go decisions
  • Prevent uncontrolled customization or hidden R&D
  • Identify recurring configuration patterns across customers
  • Quantify demand for advanced optimizations (quantization, speculative decoding, etc.)
  • Surface structured insights to Product and Engineering
  • Help evolve platform capabilities based on real workload data