Member of Technical Staff - Pretraining / Inference Optimization

Posted 2024-11-07

📍 Location: Germany, USA

🔍 Industry: Generative image and video models

🏢 Company: Black Forest Labs

🪄 Skills: Python, Software Development, Artificial Intelligence, Git, Machine Learning, NumPy, PyTorch, Algorithms, Go, Linux

Requirements:
  • Familiarity with effective techniques for optimizing inference and training workloads.
  • Knowledge of how to optimize both memory-bound and compute-bound operations.
  • Understanding of GPU memory hierarchy and computation capabilities.
  • Deep understanding of efficient attention algorithms.
  • Experience implementing forward and backward Triton kernels with a focus on correctness and floating-point errors.
  • Ability to integrate custom-written kernels into PyTorch using tools like pybind.
Responsibilities:
  • Finding ideal training strategies for various model sizes and compute loads.
  • Profiling, debugging, and optimizing single- and multi-GPU operations using tools such as Nsight.
  • Reasoning about speed and quality trade-offs of quantization for model inference.
  • Developing and improving low-level kernel optimizations for state-of-the-art inference and training.
  • Developing new ideas to maximize GPU performance.