Staff Research Engineer, Model Efficiency

Cohere (Artificial Intelligence)
We have offices in Toronto, Montreal, San Francisco, New York, Paris, Seoul and London. We embrace a remote-friendly environment... You'll find the Model Efficiency team concentrated in the EST and PST time zones; these are our preferred locations.
EST and PST, Full-Time, Staff
Salary not disclosed

Job Details

Required Skills
Machine Learning, Software Engineering, LLM

Requirements

  • PhD in Machine Learning or a related field.
  • Deep understanding of LLM architecture.
  • Experience optimizing LLM inference given resource constraints.
  • Significant experience with one or more techniques that enhance model efficiency.
  • Strong software engineering skills.
  • Ability to work in a fast-paced, high-ambiguity start-up environment.
  • Publications at top-tier conferences and venues (ICLR, ACL, NeurIPS).
  • Passion for mentoring others.

Responsibilities

  • Develop, prototype, and deploy techniques that materially improve how fast and efficiently our models run in production.
  • Optimize model architecture and MoE routing.
  • Implement decoding and inference-time algorithm improvements.
  • Perform software/hardware co-design for GPU acceleration.
  • Execute performance optimization without compromising model quality.
  • Mentor other engineers.