Machine Learning Systems Engineer

Posted 3 months ago
San Francisco Bay Area | North America | South America | Full-Time | AI, Machine Learning
Company: RelationalAI
Location: San Francisco Bay Area, North America, South America, EST, PST
Languages: English
Seniority level: Senior, 3+ years
Experience: 3+ years
Skills: Docker, Python, Kubernetes, C++
Requirements:
Proficiency in C++ and Python
Deep understanding of HPC concepts (MPI, BSP, Multi-GPU/Multi-node distributed computing)
CUDA/ROCm programming experience preferred
Solid understanding of gradient descent and backpropagation algorithms
Experience with transformer architectures
Knowledge of deep learning training and its applications
Understanding of distributed training techniques
3+ years of experience in machine learning engineering or research
Experience with large-scale distributed training frameworks (Megatron-LM, DeepSpeed, FairScale, etc.)
Familiarity with inference optimization frameworks (vLLM, TensorRT, etc.)
Experience with containerization (Docker, Kubernetes) and cluster management
Background in systems programming and performance optimization
Publications in machine learning research preferred
Ability to read, understand, and implement techniques from recent ML research papers
Demonstrated commitment to open source development and community collaboration
Responsibilities:
Contribute code and performance improvements to the open source project.
Develop and optimize distributed training algorithms for large language models.
Implement high-performance inference engines and optimization techniques.
Work on integration between the vLLM, Megatron-LM, and HuggingFace ecosystems.
Build tools for seamless model training, fine-tuning, and deployment.
Optimize performance on advanced GPU architectures.
Collaborate with the open source community on feature development and bug fixes.
Research and implement new techniques for self-improving AI agents.