Member of Technical Staff - Model Serving / API Backend Engineer
Black Forest Labs
Generative AI
Freiburg (Germany), San Francisco (USA), or remote with a monthly in-person week to stay connected
Full-Time
Salary: 180,000 - 300,000 USD per year
Job Details
- Required Skills: AWS, Docker, Python, GCP, Kubernetes, Azure, FastAPI, Postgres, Redis, React, CI/CD
Requirements
- Built and operated systems at meaningful scale
- Understand the difference between a research prototype and a production system
- Comfortable navigating ambiguity, making tradeoffs, and improving systems under real-world constraints
- Strong judgment around performance, reliability, and cost tradeoffs
- Experience scaling APIs or ML systems under load
- Comfort working in fast-moving, research-adjacent environments
- Ownership from system design through debugging and deployment
- Building and operating ML inference services in production
- Designing scalable API architectures with async processing
- Optimizing GPU workloads (batching, quantization, compilation, CUDA)
- Managing distributed systems and task queues under variable load
- Implementing monitoring and observability for production ML systems
- Debugging performance bottlenecks across model, infrastructure, and network layers
Responsibilities
- Turn research checkpoints into production-ready inference services
- Design and maintain high-performance APIs serving millions of requests
- Optimize inference latency and throughput across GPU infrastructure
- Build scalable serving architectures that handle unpredictable traffic
- Improve reliability, monitoring, and observability across model-serving systems
- Prototype and ship demos that showcase new capabilities in days, not weeks
- Collaborate closely with researchers to move from idea to live endpoint rapidly