Senior Machine Learning Engineer - Vision-Language Models

Inspiren
Senior Living
Remote, US or Canada. NYC preferred.
Full-Time
Senior
Salary: 200,000 - 230,000 USD per year

Job Details

Experience
5+ years
Required Skills
Python, PyTorch, Prompt Engineering

Requirements

  • 5+ years of experience in machine learning engineering
  • Hands-on work in computer vision and/or LLM/VLM systems
  • Strong familiarity with Vision-Language Models (GPT-4V, Claude's vision, Gemini, LLaVA, or similar)
  • Experience building and evaluating labeling systems at scale
  • Solid understanding of how edge/device constraints shape data availability for cloud-side models
  • Proficiency with Python
  • Proficiency with PyTorch
  • Proficiency with cloud inference APIs
  • Proficiency with tools for experiment tracking and evaluation
  • Practical prompt engineering skills
  • Pragmatic engineering mindset
  • Comfortable scoping and driving work independently in a fast-moving, early-stage environment
  • Strong communication skills and a collaborative approach

Responsibilities

  • Design, build, and iterate on VLM-based pipelines that generate high-quality labels and annotations at scale, including prompt engineering, fine-tuning, and evaluation
  • Determine which signals, frames, metadata, and contextual features should be sent from edge devices to improve VLM accuracy and reduce ambiguity
  • Collaborate with Embedded Systems and Hardware teams to define device-side preprocessing and data-forwarding strategies that balance bandwidth, latency, and model performance
  • Collaborate with Data Science to build robust evaluation frameworks that measure label quality and model accuracy and detect regressions
  • Benchmark and integrate commercial and open-source VLMs, staying current on the fast-moving landscape of vision-language capabilities