Engineering Manager, Ads ML Efficiency
Reddit
Machine Learning
Remote - United States, Full-Time, Manager
Salary: $230,000 to $322,000 USD
Job Details
Required Skills
- Machine Learning
- PyTorch
- Distributed Systems
Requirements
- Deep experience with ML training, serving, debugging, and optimization.
- Proven hands-on experience improving training loops, profiling workflows, or GPU utilization.
- Strong managerial experience leading teams, coaching engineers, and managing delivery.
- Demonstrated ability to reason about production-scale ML systems and performance tradeoffs.
- Strong communication skills with the ability to explain technical tradeoffs to diverse stakeholders.
- Experience in ads ranking, recommender systems, or marketplace ML is strongly preferred.
- Experience with GPU training and serving migrations is a plus.
- Familiarity with PyTorch or distributed training frameworks is a plus.
- Experience building efficiency benchmarking or launch certification frameworks is a plus.
Responsibilities
- Hire, mentor, and retain a high-performing team of ML and systems-oriented engineers.
- Define the roadmap for training and inference optimization, launch-readiness tooling, and efficiency primitives.
- Drive measurable reductions in model training time, online latency, serving cost, and infra-driven launch risk.
- Guide the development of profiling, benchmarking, load testing, observability, and efficiency certification systems.
- Partner with model owners and platform teams to accelerate high-priority launches and remove bottlenecks.
- Establish engineering rigor around performance debugging, launch safety, and technical decision-making.