Senior Machine Learning Systems Engineer
New
R
RedditMachine Learning
Remote - United StatesFull-TimeSenior
Salary216,700 - 303,400 USD per year
Apply NowOpens the employer's application page
Job Details
- Experience
- 5+ years
- Required Skills
- PythonGCPKubernetesPyTorchSparkTensorflowTerraformBigQueryMLOps
Requirements
- 5+ years of experience in ML infrastructure, including model training and deployments.
- Hands-on experience with ML optimization, memory, and GPU profiling.
- Experience with cloud-based ML technologies (GCP BigQuery, Google Cloud Storage, Terraform).
- Experience administering MLOps tools like MLflow or Wandb.
- Proficiency in Python, PyTorch, or Tensorflow.
- Experience with distributed training frameworks like Ray and Kubernetes.
- Strong organizational and communication skills.
Responsibilities
- Lead development of a platform for large scale ML models at Reddit.
- Design end-to-end model lifecycle patterns (MLOps) including data preparation, model management, and experiment tracking.
- Develop and support a graph ML codebase and platform for scalability.
- Collaborate with ML engineers to improve model training time, efficiency, and GPU costs.
- Optimize batch data processing using tools like Apache Beam, Apache Spark, and Ray Data.
- Architect pipelines for massive graph data structures.
View Full Description & ApplyYou'll be redirected to the employer's site