Research Engineer - AI Systems

Yotta Labs
AI Infrastructure
United States, Hong Kong, Canada, Singapore
Full-Time
Salary: Competitive compensation with equity.

Job Details

Required Skills
Python, PyTorch, C++

Requirements

  • Proficiency in programming languages used for AI systems, such as Python and C++.
  • Deep understanding of GPU architecture and performance optimization.
  • Experience with CUDA, Triton, ROCm/HIP, or AWS Neuron.
  • Strong understanding of AI frameworks (e.g., PyTorch, Dynamo, LMCache).
  • Experience with model architectures and profiling tools (e.g., Nsight, ROCm Profiler, or Neuron Profiler).
  • Strong problem-solving skills and the ability to work in a collaborative, remote environment.
  • Background in computer science, engineering, or a related field.

Responsibilities

  • Design and implement high-performance kernels for Attention, MoE, GEMM, collective communication, and quantization.
  • Optimize kernels for NVIDIA, AMD, and AWS Trainium.
  • Develop custom operators and graph optimizations using the Neuron SDK, PyTorch/XLA, TorchDynamo, and the Neuron Compiler.
  • Improve performance of vLLM, SGLang, TensorRT-LLM, and custom inference runtimes.
  • Design scalable distributed training and inference solutions across thousands of accelerators.
  • Contribute to open-source projects, publish technical findings, and engage with the developer community.