Machine Learning Researcher, Audio - Multimodal LLMs


BlandAI Phone Agents

Remote (US) | Full-Time | Middle

Salary: 180,000 - 260,000 USD per year


Job Details

Experience with LLMs, multimodal models, or speech-language systems
Deep understanding of prompting, fine-tuning, and alignment techniques
Ability to reason about full systems
Comfortable designing interactions between model, tools, prompts, and runtime constraints
Can go from idea → dataset → experiment → conclusion in days
Knows how to design experiments that actually answer the question
Strong sense for what makes an interaction feel natural vs robotic
Ability to translate abstract modeling ideas into user-facing improvements
Takes ownership from research through deployment
Thrives in ambiguous, fast-moving environments
Cares about impact, not just elegance

Spearhead the development of the next-generation multimodal LLM stack
Combine speech, text, tools, and real-time reasoning into a single unified system
Build industry-leading conversational AI models for Bland's agent
Take models from idea to production
Define how agents listen, think, and act in real time
Integrate streaming audio, tool execution, and dynamic context into a single coherent system
Take ideas from research through production systems serving millions of calls per day
