Principal ML Solutions Architect - Token Factory
Nebius
Cloud AI Infrastructure
You’re welcome to work remotely from the United States.
Full-Time
Senior
Salary: 208,000 - 261,000 USD per year
Job Details
- Experience
- 8+ years of experience in ML/AI systems, with at least 4 years focused on LLMs and generative AI
- Required Skills
- Docker, Python, Kubernetes, Generative AI
Requirements
- 8+ years of experience in ML/AI systems.
- At least 4 years focused on LLMs and generative AI.
- Demonstrated technical leadership in owning ambiguous, high-impact problems.
- Expert knowledge of the LLM ecosystem, including model architectures and fine-tuning approaches.
- Deep, hands-on command of inference optimization (quantization, KV-cache management, batching, routing).
- Experience running LLMs in production at scale, including deployment, operation, and debugging.
- Hands-on experience with LLM fine-tuning (SFT, LoRA, RL) and data curation.
- Experience building LLM evaluation pipelines, including LLM-as-a-judge setups.
- Proficiency with inference frameworks like vLLM, SGLang, and TensorRT-LLM.
- Strong Python programming skills.
- Excellent communication skills for translating technical concepts to diverse audiences.
Responsibilities
- Own complex, high-stakes customer engagements from architecture through production, driving measurable business value.
- Optimize LLM inference at the framework and hardware level and codify best practices into reusable playbooks.
- Lead supervised and reinforcement fine-tuning efforts to maximize model quality.
- Design and implement production-ready LLM solutions using Token Factory's inference services.
- Provide deep technical expertise in prompt engineering, RAG architectures, model selection, and cost/performance trade-offs.
- Partner with product, engineering, and research to prototype platform features and influence the roadmap.
- Guide customers from PoC to production with a focus on performance, reliability, and cost efficiency.
- Mentor senior and mid-level Solutions Architects and represent Token Factory externally through technical talks and blog posts.