Senior Machine Learning Operations Engineer II
Life360
Consumer Technology
All positions, unless otherwise specified, can be performed remotely (within the US and Canada)
Full-Time
Senior
Salary: 148,000 - 216,000 USD per year
Job Details
- Experience
- 5+ years of professional software engineering, DevOps, or data engineering experience, with at least 2 years dedicated to building and maintaining MLOps infrastructure.
- Required Skills
- AWS, Docker, Python, SQL, Kubeflow, Kubernetes, MLflow, Spark, Terraform, MLOps
Requirements
- 5+ years of professional software engineering, DevOps, or data engineering experience
- At least 2 years dedicated to building and maintaining MLOps infrastructure
- Strong proficiency in Python
- Experience with containerization (Docker) and container orchestration platforms (Kubernetes: EKS, GKE)
- Familiarity with FastAPI
- Experience with ML lifecycle and data processing tools: MLflow, Kubeflow, SparkML, SQL, Spark/PySpark, dbt, and Airflow
- Practical experience operating within a major cloud ecosystem (AWS, GCP, Databricks)
- Bachelor’s or Master’s degree in Computer Science, Data Science, Software Engineering, or a closely related quantitative field
Responsibilities
- Design, implement, and manage automated CI/CD and Continuous Training (CT) pipelines for machine learning model development, evaluation, and delivery.
- Containerize, deploy, and scale machine learning models as high-availability microservices or batch processing workflows.
- Establish unified logging, alerting, and monitoring solutions to track model inference performance, system latency, resource utilization, data drift, and concept drift.
- Provision and optimize cloud-based ML infrastructure (including GPU/CPU computing clusters) utilizing Infrastructure as Code (IaC) paradigms.
- Work closely with product development teams to drive infrastructure adoption and efficiency gains through SDK/API development, automation, and efficient ML system maintenance.
- Implement robust lineage tracking for data, code, and model artifacts to ensure compliance, reproducibility, and security across the entire ML lifecycle.
- Work with data engineering to improve the data ecosystem, ensuring robust, scalable pipelines for experimentation and ML.
- Act as a mentor and thought leader, helping to define best practices in machine learning engineering, scalable ML service operations, and agentic AI.