Design and implement high-performance kernels for Attention, MoE, GEMM, collective communication, and quantization.
Optimize kernels for NVIDIA, AMD, and AWS Trainium.
Develop custom operators and graph optimizations using Neuron SDK, PyTorch/XLA, Torch Dynamo, and Neuron Compiler.
Improve performance of vLLM, SGLang, TensorRT-LLM, and custom inference runtimes.
Design scalable distributed training and inference solutions across thousands of accelerators.
Contribute to open-source projects, publish technical findings, and engage with the developer community.
Python, PyTorch, C++
About Yotta Labs
Yotta Labs is building the GPU Cloud for efficient machine learning, pioneering a Decentralized Operating System (DeOS) to orchestrate AI workloads across diverse hardware at planetary scale. The platform provides on-demand GPU instances and Model APIs, enabling AI companies, research labs, and enterprises to train and deploy cutting-edge models. It aggregates geo-distributed GPUs to offer elastic, cost-effective access to AI compute, unifying fragmented GPU capacity into a single execution fabric that makes high-performance computing accessible for AI training and inference. GPU pods deploy in under 3 seconds.
How We Work
Yotta Labs offers a flexible, remote work environment that values innovation and autonomy. Team members are encouraged to collaborate and contribute to open-source projects. You will work within a small, focused team and directly influence strategic decisions. The environment suits engineers with strong problem-solving skills who contribute independently.
Engineering at Yotta Labs
You will build the next-generation AI infrastructure, developing a Decentralized Operating System (DeOS) for AI workload orchestration across multi-cloud and multi-silicon environments. Engineers optimize AI workloads across a heterogeneous network of GPUs. Your work will involve cutting-edge technologies that bridge AI and decentralized computing. You will contribute to high-performance computing for AI training and inference on a wide spectrum of hardware. This means tackling challenges in distributed systems, cloud computing, and blockchain technologies.
Why Join Us
Be part of a visionary team redefining AI infrastructure.
Work on cutting-edge technologies bridging AI and decentralized computing.
Collaborate with experts from leading institutions and tech companies.
Enjoy a flexible, remote work environment valuing innovation and autonomy.
Directly impact the scalability and performance of AI applications.