Senior Staff Machine Learning Engineer
New
Source API remote eligibility restrictions: Canada; We are actively hiring for this role in the US and Canada.Full-TimeStaff
Salary not disclosed
Apply NowOpens the employer's application page
Job Details
- Experience
- 5–8+ years
- Required Skills
- Machine LearningPyTorchTensorflowNLPGenerative AI
Requirements
- Bachelor’s degree in Computer Science, Mathematics, or a related field; Master’s or Ph.D. preferred.
- 5–8+ years of industry experience building and deploying machine learning systems in production, including significant experience working with LLMs.
- Strong expertise in NLP, Generative AI, transformer architectures, embeddings, and retrieval systems.
- Proven experience designing and deploying Retrieval-Augmented Generation (RAG) systems in enterprise environments.
- Experience building and evaluating complex agentic or multi-step LLM workflows.
- Strong knowledge of modern ML frameworks and tools (e.g., PyTorch, TensorFlow, Hugging Face) and distributed/cloud-based infrastructure.
- Demonstrated ability to optimize real-time ML systems for performance, scalability, and reliability.
- Strong technical leadership skills, with the ability to influence cross-functional decisions and raise the engineering bar.
Responsibilities
- Lead the design and development of Cresta’s next-generation AI Agents and Agentic Assist systems, defining system architecture and core modeling approaches.
- Architect intelligent, multi-step agent workflows that combine real-time guidance, knowledge retrieval, reasoning, summarization, and automated actions into cohesive production systems.
- Design, deploy, and optimize LLM-powered systems, including Retrieval-Augmented Generation (RAG) pipelines, multi-agent orchestration, and domain-adapted models.
- Improve reasoning, planning, and tool-use capabilities in real-world AI applications.
- Develop evaluation strategies for complex, non-deterministic systems, including offline benchmarking, online experimentation, and LLM-as-a-judge methodologies.
- Diagnose and mitigate real-world failure modes such as hallucinations, retrieval errors, tool misuse, prompt brittleness, and multi-step reasoning breakdowns.
- Define and measure quality metrics (e.g., accuracy, faithfulness, task completion, latency, cost, robustness) to improve system reliability and performance.
- Optimize AI systems for scalability, latency, security, and cost efficiency in production environments.
- Collaborate cross-functionally with product, frontend, and backend teams to integrate AI capabilities seamlessly into Cresta’s platform.
- Mentor engineers, contribute to technical strategy, and help shape the roadmap for Cresta’s AI systems.
View Full Description & ApplyYou'll be redirected to the employer's site