Staff Machine Learning Engineer
U.S., Full-Time, Staff
Salary: 220,000 - 280,000 USD per year
Job Details
- Experience: 7+ years
- Required Skills: AWS, Python, SQL, GCP, Kafka, Kubeflow, MLFlow, C++, Go, Rust, BigQuery, Databricks
Requirements
- 7+ years of experience in Machine Learning Engineering or Backend Engineering, with a proven track record of deploying and maintaining complex ML models in high-traffic production environments.
- 3+ years of technical leadership, driving architecture decisions for consumer applications or scalable backend platforms.
- Experience with Real-Time Data: Proficient in streaming architectures (Kafka/Flink/PubSub) and in building low-latency services that serve model inference in under 100 ms.
- MLOps Expertise: Deep experience managing the full ML lifecycle (training, deployment, monitoring) using tools like MLFlow, Kubeflow, Databricks, or SageMaker.
- Strong Coding Skills: Expert in Python and SQL; proficiency in Go, C++, or Rust is a strong plus for building high-performance inference layers.
- Cloud Native: Deep experience with GCP services (BigQuery, Cloud Functions, GKE, Vertex AI) or AWS equivalents.
Responsibilities
- Architect Scalable ML Systems: Design and build the end-to-end machine learning infrastructure, transitioning experimental Data Science models into robust, high-availability production services.
- Real-Time Inference at Scale: Steer the design and deployment of low-latency services to serve model inferences in milliseconds. You will power real-time decisions across the platform, from dynamic oddsmaking and risk analysis to smart deposit defaults.
- Feature Engineering & Data Strategy: Partner with Data Science to build scalable logging and data pipelines. You will lead the creation and optimization of a centralized feature store required to train complex models across diverse business domains.
- End-to-End MLOps Leadership: Champion best practices for model deployment, monitoring, and CI/CD for ML. You will implement automated retraining pipelines and observability tools to ensure data drift and model degradation are caught and addressed instantly.