AI Research Engineer (Multi-Modal & Vision)
India · Full-Time · Middle
Salary not disclosed
Job Details
- Required Skills
- GitHub
Requirements
- Bachelor's degree in Computer Science, Machine Learning, Artificial Intelligence, or a related field (Master's or PhD preferred).
- Strong hands-on experience working with multimodal AI systems, particularly vision-language models.
- Proven expertise in supervised fine-tuning, knowledge distillation, reinforcement learning from human feedback (RLHF), and other post-training optimization techniques.
- Experience with parameter-efficient fine-tuning approaches.
- Experience with distributed training frameworks.
- Demonstrated success improving model performance on industry-standard benchmarks or production use cases.
- Strong understanding of model optimization techniques for resource-constrained environments.
- Experience building scalable machine learning pipelines and training workflows on GPU infrastructure.
- Proven contributions to open-source multimodal AI projects through platforms such as GitHub or Hugging Face.
- Research background supported by publications in leading AI conferences or journals is highly desirable.
Responsibilities
- Conduct end-to-end research and development of vision-language models, including training, evaluation, optimization, and deployment.
- Design and implement advanced post-training methodologies such as supervised fine-tuning, knowledge distillation, and RLHF.
- Build, curate, and maintain high-quality multimodal datasets.
- Improve model efficiency and scalability through optimization and compression techniques.
- Develop benchmarking systems to assess model quality and real-world performance.
- Build and maintain distributed training workflows on GPU infrastructure.
- Monitor emerging research and translate advancements into practical improvements.