Staff Research Engineer, Model Efficiency
Cohere
Artificial Intelligence
We have offices in Toronto, Montreal, San Francisco, New York, Paris, Seoul and London. We embrace a remote-friendly environment... You'll find the Model Efficiency team concentrated in the EST and PST time zones; these are our preferred locations.
EST and PST, Full-Time, Staff
Salary not disclosed
Job Details
Required Skills
- Machine Learning
- Software Engineering
- LLM
Requirements
- PhD in Machine Learning or a related field.
- Deep understanding of LLM architecture.
- Experience optimizing LLM inference given resource constraints.
- Significant experience with one or more techniques that enhance model efficiency.
- Strong software engineering skills.
- Ability to work in a fast-paced, high-ambiguity start-up environment.
- Publications at top-tier conferences and venues (ICLR, ACL, NeurIPS).
- Passion for mentoring others.
Responsibilities
- Develop, prototype, and deploy techniques that materially improve how fast and efficiently our models run in production.
- Optimize model architecture and MoE routing.
- Implement decoding and inference-time algorithm improvements.
- Perform software/hardware co-design for GPU acceleration.
- Execute performance optimization without compromising model quality.
- Mentor other engineers.