Senior Research Engineer, Evaluations (Speech-to-Text)
AssemblyAI, Speech AI
Eastern US Time Zone, Full-Time, Senior
Salary: 210,000 - 260,000 USD per year
Job Details
- Required Skills: Python, SQL, Machine Learning, LLM
Requirements
- ML fundamentals: Understand how ML models are trained and evaluated well enough to interpret results and debug issues
- Strong Python skills: Write clean evaluation scripts, work with data pipelines
- Comfortable with SQL
- Comfortable with cloud infrastructure
- Metric intuition: Understand what makes a good evaluation metric, when to use relative vs. absolute improvements, and how to ensure statistical rigor
- Voice agent stack familiarity: Understand how the components of a voice agent system interact (VAD, ASR, turn detection, LLM, TTS)
- Tinkerer mentality: Ship something rough and iterate rather than spending weeks perfecting it
- Communication skills: Explain technical results to researchers, summarize findings for leadership, and translate customer feedback into requirements
- Ownership mindset: See gaps and fill them without waiting to be told what to evaluate
- Time zone overlap: Work hours that overlap with the Eastern US Time Zone by at least 3-4 hours
Responsibilities
- Own end-to-end and integration-level model evaluation across accuracy, latency, and feature-specific metrics (e.g., turn detection latency, endpointing accuracy)
- Build and maintain competitive benchmarking pipelines against other providers in the market
- Design and run systematic experiments to measure the impact of model changes
- Onboard, curate, and maintain evaluation datasets—both public benchmarks and internal test sets
- Create evaluation subsets that stress-test specific capabilities and edge cases
- Define evaluation metrics that capture real-world performance
- Translate qualitative customer feedback into quantifiable evaluation criteria
- Work with customer-facing teams to understand pain points and convert them into research priorities
- Reduce friction for researchers by maintaining clean evaluation pipelines and clear documentation
- Identify evaluation gaps proactively and propose solutions
- Move fast—iterate on benchmarking approaches weekly, not monthly