Machine Learning Engineer
Twilio
Communications Technology
Remote - US (candidates in CA, CT, NJ, NY, PA, and WA are not eligible for hire)
Full-Time
Middle
Salary not disclosed
Job Details
- Experience: 5+ years
- Required Skills: AWS, Docker, Python, SQL, Java, Kubeflow, Kubernetes, Machine Learning, Airflow, MLOps
Requirements
- Strong foundation in ML/AI (statistics, probability, optimization) with the ability to apply these concepts to real-world problems.
- 5+ years of experience building, deploying, and operating data and ML systems in production.
- Proficient in Python, Java, and SQL; strong software engineering fundamentals (system design, testing, version control, code reviews).
- Hands-on experience with workflow orchestration and data pipelines (e.g., Airflow, Kubeflow) and cloud data platforms/storage (e.g., SageMaker Feature Store, Snowflake, DynamoDB, OpenSearch).
- Experience with the ML lifecycle and MLOps tooling (e.g., MLflow, Metaflow, SageMaker; LLM/agent frameworks such as LangChain/LangGraph; model evaluation/observability tools such as Galileo or similar).
- Working knowledge of containerization and cloud infrastructure, including Docker and Kubernetes, GitOps/CI/CD tools (e.g., Argo CD), and at least one major cloud platform (AWS, GCP, or Azure).
- Understanding of data modeling and scalable systems, including distributed computing and streaming frameworks (e.g., Spark/EMR, Flink, Kafka Streams); familiarity with GPU-based implementation is a plus.
- Demonstrated ability to ramp up quickly and operate effectively in new application/business domains.
- Strong written and verbal communication skills: able to document and present designs and decisions, and comfortable giving/receiving feedback in an Agile environment.
Responsibilities
- Partner with product, UX, and technical stakeholders to analyze business problems, clarify requirements, define scope, and translate them into measurable ML problem statements.
- Design, implement, and maintain scalable, enterprise-grade ML solutions in production.
- Build reproducible ML workflows for data preparation, training, evaluation, and inference using modern orchestration and MLOps tooling.
- Implement monitoring and evaluation frameworks to continuously improve data quality, model performance, latency, and cost through feedback loops.
- Partner cross-functionally with Product, Data Science/ML, Engineering, and Security to deliver resilient, scalable, and compliant ML-powered services.
- Demonstrate end-to-end systems understanding and articulate the “why” behind model and system design choices.
- Own operational excellence: SLAs, on-call, incident response, customer feedback triage, and blameless post-mortems.
- Drive engineering excellence via AI-assisted SDLC, code reviews, automated testing, MLOps best practices, knowledge-sharing, and mentoring.
- Actively adopt AI-assisted practices to improve implementation and collaboration efficiency.