Senior Staff Machine Learning Engineer


Remote eligibility restrictions: Canada. We are actively hiring for this role in the US and Canada.

Full-Time, Staff

Salary not disclosed


Job Details

Qualifications

Bachelor’s degree in Computer Science, Mathematics, or a related field; Master’s or Ph.D. preferred.
5–8+ years of industry experience building and deploying machine learning systems in production, including significant experience working with LLMs.
Strong expertise in NLP, Generative AI, transformer architectures, embeddings, and retrieval systems.
Proven experience designing and deploying Retrieval-Augmented Generation (RAG) systems in enterprise environments.
Experience building and evaluating complex agentic or multi-step LLM workflows.
Strong knowledge of modern ML frameworks and tools (e.g., PyTorch, TensorFlow, Hugging Face) and distributed/cloud-based infrastructure.
Demonstrated ability to optimize real-time ML systems for performance, scalability, and reliability.
Strong technical leadership skills, with the ability to influence cross-functional decisions and raise the engineering bar.

Responsibilities

Lead the design and development of Cresta’s next-generation AI Agents and Agentic Assist systems, defining system architecture and core modeling approaches.
Architect intelligent, multi-step agent workflows that combine real-time guidance, knowledge retrieval, reasoning, summarization, and automated actions into cohesive production systems.
Design, deploy, and optimize LLM-powered systems, including Retrieval-Augmented Generation (RAG) pipelines, multi-agent orchestration, and domain-adapted models.
Improve reasoning, planning, and tool-use capabilities in real-world AI applications.
Develop evaluation strategies for complex, non-deterministic systems, including offline benchmarking, online experimentation, and LLM-as-a-judge methodologies.
Diagnose and mitigate real-world failure modes such as hallucinations, retrieval errors, tool misuse, prompt brittleness, and multi-step reasoning breakdowns.
Define and measure quality metrics (e.g., accuracy, faithfulness, task completion, latency, cost, robustness) to improve system reliability and performance.
Optimize AI systems for scalability, latency, security, and cost efficiency in production environments.
Collaborate cross-functionally with product, frontend, and backend teams to integrate AI capabilities seamlessly into Cresta’s platform.
Mentor engineers, contribute to technical strategy, and help shape the roadmap for Cresta’s AI systems.
