AI Performance Optimization Engineer
Fully remote work model across the United States
Full-Time
Senior
Salary not disclosed
Job Details
- Experience: 6+ years
- Required Skills: Python, FPGA Architecture, C++
Requirements
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
- 6+ years of experience in ML systems, performance engineering, or high-performance computing.
- Strong programming skills in Python and C++, with production-level engineering experience.
- Hands-on experience optimizing deep learning workloads on modern GPU architectures.
- Deep understanding of distributed training, inference systems, and model parallelism techniques.
- Experience with profiling tools across CPU, GPU, and distributed systems.
- Strong knowledge of memory hierarchies, communication overheads, and system bottlenecks.
- Familiarity with model compression and optimization techniques and their trade-offs.
- Strong analytical skills with a disciplined, measurement-driven engineering approach.
- Excellent communication skills and the ability to collaborate across technical and non-technical teams.
Responsibilities
- Profile and optimize end-to-end AI pipelines to improve throughput, latency, and cost efficiency.
- Identify bottlenecks across compute, memory, networking, and data pipelines, and implement targeted optimizations.
- Develop and tune advanced model optimization techniques such as quantization, sparsity, pruning, and compression.
- Optimize distributed training and inference using parallelism strategies.
- Improve LLM serving performance through techniques such as KV caching, batching, and speculative decoding.
- Drive kernel and compiler-level optimizations using tools like Triton, XLA, TorchInductor, or TVM.
- Build benchmarking frameworks, performance monitoring systems, and regression testing suites.
- Collaborate with cross-functional engineering teams to integrate performance best practices into production systems.
- Evaluate hardware and software technologies and guide adoption decisions.
- Document optimization strategies and contribute to internal knowledge sharing.