Research Engineer - AI Systems
Yotta Labs | AI Infrastructure
United States, Hong Kong, Canada, Singapore | Full-Time
Salary: Competitive compensation with equity.
Job Details
Required Skills
- Python, PyTorch, C++
Requirements
- Proficiency in AI programming languages such as Python and C++.
- Deep understanding of GPU architecture and performance optimization.
- Experience with CUDA, Triton, ROCm/HIP, or AWS Neuron.
- Strong understanding of AI frameworks (e.g., PyTorch, Dynamo, LMCache).
- Experience with model architectures and profiling tools (e.g., Nsight, ROCm Profiler, or Neuron Profiler).
- Strong problem-solving skills and the ability to work in a collaborative, remote environment.
- Background in computer science, engineering, or a related field.
Responsibilities
- Design and implement high-performance kernels for Attention, MoE, GEMM, collective communication, and quantization.
- Optimize kernels for NVIDIA, AMD, and AWS Trainium.
- Develop custom operators and graph optimizations using Neuron SDK, PyTorch/XLA, TorchDynamo, and Neuron Compiler.
- Improve performance of vLLM, SGLang, TensorRT-LLM, and custom inference runtimes.
- Design scalable distributed training and inference solutions across thousands of accelerators.
- Contribute to open-source projects, publish technical findings, and engage with the developer community.