Senior Software Engineer II - Applied AI and Evaluations

US | Full-Time | Senior
Salary: 175000 - 245000 USD per year

Job Details

Experience
8+ years
Required Skills
Python, MLflow, Databricks, Prompt Engineering

Requirements

  • 8+ years of software engineering experience
  • 2+ years working directly with LLMs in production
  • Deep, hands-on experience with prompt engineering
  • Deep, hands-on experience with context engineering
  • Strong working knowledge of RAG architectures
  • Experience building or extending LLM evaluation frameworks
  • Fluency in agent system design
  • Strong Python skills
  • Comfortable working in data-heavy environments (Databricks, Delta tables, or equivalent)
  • Ability to communicate complex quality findings (written and verbal) to both technical and non-technical stakeholders
  • Strong cross-functional judgment
  • A bias for clarity in ambiguous situations
  • BS or MS in Computer Science, a related field, or equivalent industry experience
  • Experience with MLflow or similar experiment tracking platforms (Strong Plus)
  • Familiarity with CI-integrated evaluation pipelines (Strong Plus)
  • Experience with multi-agent orchestration frameworks (Strong Plus)
  • Prior work in an Applied AI or LLMOps function within a product company (Strong Plus)

Responsibilities

  • Own agent quality end-to-end: diagnosis, improvement, and validation across SmartAssist's orchestrator and subagents
  • Identify failure modes across quality dimensions (factual accuracy, completeness, tone, actionability, and latency) and prioritize what to fix
  • Drive quality improvements through prompt engineering, context engineering, and RAG retrieval tuning
  • Extend and mature our evaluation framework: scorers, golden datasets, regression gates, and online evaluation for production traffic
  • Close the feedback loop: ensure that every change has a measurable, attributable quality signal
  • Collaborate with our Agent Architecture lead to distinguish quality problems that require prompt/context solutions from those that require structural fixes
  • Establish repeatable methodology that scales beyond any single agent or subagent