Follow the latest research on LLMs, inference, and source-code generation. Propose and evaluate innovations in inference quality and efficiency. Implement and monitor LLM inference metrics in production. Write high-quality, high-performance code in Python, Cython, C/C++, Triton, ThunderKittens, native CUDA, and AWS Neuron. Collaborate with the team, plan next steps, and communicate progress.
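To illustrate the metrics-monitoring responsibility above, here is a minimal sketch of measuring two standard LLM inference metrics, time-to-first-token and decode throughput, for a streaming generation call. The function and field names are hypothetical and not tied to any specific serving stack.

```python
import time
from dataclasses import dataclass

@dataclass
class InferenceMetrics:
    ttft_s: float            # time to first token, in seconds
    decode_tok_per_s: float  # steady-state decode throughput

def measure(token_stream):
    """Consume a token-yielding iterable and record latency metrics.

    `token_stream` can be any iterable that yields generated tokens;
    this helper is illustrative, not a specific framework's API.
    """
    start = time.perf_counter()
    first = None
    count = 0
    for _ in token_stream:
        now = time.perf_counter()
        if first is None:
            first = now  # timestamp of the first emitted token
        count += 1
    end = time.perf_counter()
    ttft = (first - start) if first is not None else float("nan")
    decode_time = (end - first) if first is not None else 0.0
    # Throughput over the decode phase (tokens after the first one).
    tps = (count - 1) / decode_time if count > 1 and decode_time > 0 else 0.0
    return InferenceMetrics(ttft_s=ttft, decode_tok_per_s=tps)
```

In production these values would typically be exported to a metrics backend (e.g. as histograms) rather than returned, but the measurement logic is the same.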