Member of Technical Staff, Exceptional Generalist
Inferact
AI Infrastructure
Fully remote, worldwide. Expect regular overlap with Pacific Time for critical syncs.
Full-Time
Senior
Salary: We offer competitive compensation (salary + equity) relative to local market conditions.
Job Details
Required Skills
- Python, Kubernetes, PyTorch, C++, Go, Rust, LLM, Distributed Systems
Requirements
- Bachelor's degree or equivalent experience in computer science, engineering, or a related field.
- Demonstrated ability to work autonomously and drive complex projects to completion.
- Strong track record of shipping high-impact work in technical environments.
- Excellent asynchronous communication skills and ability to collaborate across time zones.
- Deep expertise in at least one of: systems programming, GPU/accelerator programming, distributed systems, or ML infrastructure.
- Strong technical proficiency in at least two of: CUDA kernels (or Triton/TileLang/Pallas), distributed systems (Rust, Go, or C++), Python/PyTorch/LLM systems, or Kubernetes/container orchestration.
- Knowledge of transformer architectures and KV-cache memory management.
- Preferred: Contributions to vLLM or major open-source ML/systems projects.
- Preferred: Experience with multiple accelerator platforms (NVIDIA, AMD, TPU, Intel).
- Preferred: Experience with quantization, kernel optimization, or compiler technologies.
Responsibilities
- Optimize inference runtime performance for LLM and diffusion models across diverse hardware.
- Develop low-level CUDA kernels or equivalent (Triton, TileLang, Pallas) to enhance inference speed.
- Design and implement high-performance distributed systems for serving models at scale.
- Build operational infrastructure for cluster management, deployment automation, and production monitoring.
- Manage KV-cache memory and model serving within the vLLM stack.
- Collaborate asynchronously with global team members while maintaining code ownership.