ML Engineer - Inference
Posted 4 months ago
Requirements:
- Strong background in machine learning, with hands-on experience developing and deploying models for inference.
- Proficiency in Python and machine learning frameworks such as TensorFlow or PyTorch.
- Experience with optimization techniques including quantization, pruning, and model compression.
- Strong problem-solving skills for troubleshooting complex technical issues.
- Excellent communication and collaboration skills in a remote team environment.
- Passion for learning and staying updated on advancements in ML inference technologies.
- Experience with ML inference libraries such as cuDNN/TensorRT, ROCm, OpenVINO, or OpenPPL; knowledge of ML communication frameworks like NCCL is an advantage.
Responsibilities:
- Design and implement efficient algorithms and models for real-time inference in Conversational AI applications, with a focus on Nebula.
- Collaborate with cross-functional teams to integrate machine learning models into production systems, ensuring scalability, reliability, and performance.
- Optimize and fine-tune machine learning models for resource-constrained environments.
- Develop monitoring and evaluation mechanisms to assess the performance of inference models in production.
- Stay updated on advancements in machine learning inference techniques and incorporate new approaches as needed.
- Contribute to the documentation of best practices for implementing and deploying ML inference solutions.