Member of Technical Staff - Pretraining / Inference Optimization

Posted 7 months ago

📍 Location: Germany, USA

🔍 Industry: Generative image and video models

🪄 Skills: Python, Software Development, Artificial Intelligence, Git, Machine Learning, NumPy, PyTorch, Algorithms, Go, Linux

Familiarity with effective techniques for optimizing inference and training workloads.
Knowledge of optimizing both memory-bound and compute-bound operations.
Understanding of GPU memory hierarchy and computation capabilities.
Deep understanding of efficient attention algorithms.
Experience implementing forward and backward Triton kernels with a focus on correctness and floating-point errors.
Ability to integrate custom-written kernels into a PyTorch framework using tools like pybind11.

Finding ideal training strategies for various model sizes and compute loads.
Profiling, debugging, and optimizing single- and multi-GPU operations using tools such as Nsight.
Reasoning about speed and quality trade-offs of quantization for model inference.
Developing and improving low-level kernel optimizations for state-of-the-art inference and training.
Developing new ideas to maximize GPU performance.