Senior AI Platform Engineer, Core Cloud Engineering
Vultr · Cloud Infrastructure
Remote - United States · Full-Time · Senior
Salary: 110,000 - 140,000 USD per year
Job Details
Required Skills
- Docker
Requirements
- Hands-on experience deploying and operating LLM inference systems (vLLM, SGLang, TGI, or comparable) at non-trivial scale.
- Strong Docker and container skills; comfortable owning the full container lifecycle from image build to production.
- Deep familiarity with GitLab CI/CD — pipeline authoring, custom runners, artifact management, and integrating external tooling.
- Working knowledge of MCP or similar context-injection patterns for grounding LLMs against private or internal data.
- Demonstrated ability to evaluate open-source models for specific task fit — not just benchmarks, but real use-case performance against internal workloads.
- Strong software engineering fundamentals — this role writes real code, not just configuration.
- Experience with RAG pipelines — vector databases, chunking strategies, retrieval evaluation — especially over code or technical documentation.
- GPU infrastructure familiarity — CUDA basics, multi-GPU serving, memory management under inference load.
- Ability to communicate technical tradeoffs clearly to engineers, managers, and leadership; track record of moving organizations toward new practices.
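The RAG requirement above mentions chunking strategies for code and technical documentation. As a minimal sketch of what that means in practice, here is a fixed-size chunker with overlap; the sizes and overlap values are illustrative defaults chosen for this example, not figures from the posting.

```python
# Minimal sketch of fixed-size chunking with overlap, a common pre-processing
# step before embedding documents into a vector database for RAG.
# chunk_size and overlap are illustrative, not tuned recommendations.

def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows.

    Overlap keeps content that straddles a chunk boundary retrievable
    from at least one chunk.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks
```

Production pipelines typically chunk on semantic boundaries (functions, headings, sentences) rather than raw character counts, but the overlap idea carries over unchanged.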
Responsibilities
- Evaluate and curate open-source models (Llama, Mistral, Qwen, DeepSeek, Kimi, and others) for fit across engineering use cases including code generation, review, test writing, and summarization.
- Build and maintain MCP (Model Context Protocol) servers that expose internal context (codebases, runbooks, incident history, architecture docs, development environments, and testing suites) to AI assistants and coding agents.
- Integrate AI capabilities directly into GitLab CI/CD pipelines: automated code review, test generation, changelog drafting, PR summarization, and anomaly detection in build output.
- Own the model lifecycle: versioning, A/B routing, quantization tradeoffs, and performance benchmarking under real engineering workloads.
- Drive AI adoption across the software engineering organization — identify high-leverage workflows, instrument usage, and iterate based on real data on time-savings and quality impact.
- Build and configure IDE tooling integrations (Cursor, Continue, and Copilot alternatives) backed by internal inference endpoints, keeping code off third-party APIs wherever possible.
- Produce documentation, internal workshops, and working examples that help engineers go from AI-curious to AI-reliant — including a shared library of prompts, system instructions, and RAG pipelines tuned for Vultr’s stack.
- Collaborate closely with Software Engineers, SREs, and Network Engineers to ensure the AI platform layer serves all teams without becoming a bottleneck or single point of failure.
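The model-lifecycle responsibility above includes A/B routing between model versions. One common approach is deterministic hash-based bucketing, sketched below; the model names and traffic weights are invented for illustration and are not part of the posting.

```python
# Hypothetical sketch of sticky A/B routing between two model versions.
# Hashing the user ID makes the assignment deterministic, so a given user
# always hits the same model across requests.
import hashlib

ROUTES = {"model-a": 90, "model-b": 10}  # percent of traffic, illustrative

def route(user_id: str, routes: dict[str, int] = ROUTES) -> str:
    """Pick a model for a user, stable across requests."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    cumulative = 0
    for model, weight in routes.items():
        cumulative += weight
        if bucket < cumulative:
            return model
    return next(iter(routes))  # fallback if weights sum to less than 100
```

In a real serving layer this logic would sit in the gateway in front of the inference endpoints, with per-model metrics recorded so the benchmarking data mentioned above can drive the traffic split.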