Senior Machine Learning Engineer - Vision-Language Models
Inspiren, Senior Living
Remote, US or Canada. NYC preferred. Full-Time. Senior.
Salary: 200,000 - 230,000 USD per year
Job Details
- Experience: 5+ years
- Required Skills: Python, PyTorch, Prompt Engineering
Requirements
- 5+ years of experience in machine learning engineering
- Hands-on work in computer vision and/or LLM/VLM systems
- Strong familiarity with Vision-Language Models (GPT-4V, Claude's vision, Gemini, LLaVA, or similar)
- Experience building and evaluating labeling systems at scale
- Solid understanding of how edge/device constraints shape data availability for cloud-side models
- Proficiency with Python
- Proficiency with PyTorch
- Proficiency with cloud inference APIs
- Proficiency with tools for experiment tracking and evaluation
- Practical prompt engineering skills
- Pragmatic engineering mindset
- Comfortable scoping and driving work independently in a fast-moving, early-stage environment
- Strong communication skills and a collaborative approach
Responsibilities
- Design, build, and iterate on VLM-based pipelines that generate high-quality labels and annotations at scale, including prompt engineering, fine-tuning, and evaluation
- Determine which signals, frames, metadata, and contextual features should be sent from edge devices to improve VLM accuracy and reduce ambiguity
- Collaborate with Embedded Systems and Hardware teams to define device-side preprocessing and data-forwarding strategies that balance bandwidth, latency, and model performance
- Collaborate with Data Science to build robust evaluation frameworks that measure label quality and model accuracy and detect regressions
- Benchmark and integrate commercial and open-source VLMs, staying current on the fast-moving landscape of vision-language capabilities