Senior ML Operations (MLOps) Engineer

Based in United States | Full-Time | Senior
Salary: Competitive compensation with meaningful equity participation

Job Details

Experience
5+ years
Required Skills
AWS, Python, PyTorch, TensorFlow, CI/CD, MLOps, Distributed Systems

Requirements

  • 5+ years of software engineering experience with a focus on ML infrastructure, distributed systems, or large-scale data processing
  • Strong proficiency in Python and ML frameworks such as PyTorch, TensorFlow, or equivalent
  • Hands-on experience with MLOps workflows, including model training pipelines, orchestration, and CI/CD deployment systems
  • Proven track record of deploying ML models into production at scale with monitoring and feedback systems
  • Strong experience with cloud platforms (AWS preferred), including services for compute, storage, and observability
  • Familiarity with distributed systems, streaming data, and large-scale data processing architectures
  • Strong understanding of system performance optimization, including latency, cost, and scalability trade-offs
  • Experience working in cross-functional teams in fast-paced, product-driven environments
  • Strong communication skills and ability to collaborate effectively in remote settings

Responsibilities

  • Design, build, and maintain scalable ML infrastructure, including data pipelines, training workflows, and model deployment systems
  • Own end-to-end ML lifecycle operations, ensuring reliable delivery of models into production environments at scale
  • Develop and optimize CI/CD pipelines for machine learning workflows, enabling rapid and safe iteration
  • Implement monitoring, telemetry, and feedback loops for ML models running across large-scale device fleets
  • Collaborate with R&D, firmware, backend, and data teams to ensure seamless integration of ML inference systems
  • Build tooling, microservices, and frameworks to improve experimentation, data processing, and deployment efficiency
  • Optimize compute, storage, and infrastructure costs while maintaining high performance and reliability
  • Ensure strong observability and system health across all ML production services