AI Research Engineer (Multi-Modal & Vision)

India | Full-Time | Middle
Salary not disclosed

Job Details

Required Skills
GitHub

Requirements

  • Bachelor's degree in Computer Science, Machine Learning, Artificial Intelligence, or a related field (Master's or PhD preferred).
  • Strong hands-on experience working with multimodal AI systems, particularly vision-language models.
  • Proven expertise in supervised fine-tuning, knowledge distillation, reinforcement learning from feedback, and other post-training optimization techniques.
  • Experience with parameter-efficient fine-tuning approaches.
  • Experience with distributed training frameworks.
  • Demonstrated success improving model performance on industry-standard benchmarks or production use cases.
  • Strong understanding of model optimization techniques for resource-constrained environments.
  • Experience building scalable machine learning pipelines and training workflows on GPU infrastructure.
  • Proven contributions to open-source multimodal AI projects through platforms such as GitHub or Hugging Face.
  • Research background supported by publications in leading AI conferences or journals is highly desirable.

Responsibilities

  • Conduct end-to-end research and development of vision-language models, including training, evaluation, optimization, and deployment.
  • Design and implement advanced post-training methodologies such as supervised fine-tuning, knowledge distillation, and reinforcement learning from human feedback (RLHF).
  • Build, curate, and maintain high-quality multimodal datasets.
  • Improve model efficiency and scalability through optimization and compression techniques.
  • Develop benchmarking systems to assess model quality and real-world performance.
  • Build and maintain distributed training workflows on GPU infrastructure.
  • Monitor emerging research and translate advancements into practical improvements.