Senior Machine Learning Systems Engineer

Reddit | Machine Learning
Remote - United States | Full-Time | Senior
Salary: 216,700 - 303,400 USD per year

Job Details

Experience
5+ years
Required Skills
Python, GCP, Kubernetes, PyTorch, Spark, TensorFlow, Terraform, BigQuery, MLOps

Requirements

  • 5+ years of experience in ML infrastructure, including model training and deployments.
  • Hands-on experience with ML optimization, memory, and GPU profiling.
  • Experience with cloud-based ML technologies (GCP BigQuery, Google Cloud Storage, Terraform).
  • Experience administering MLOps tools like MLflow or Weights & Biases (W&B).
  • Proficiency in Python, PyTorch, or TensorFlow.
  • Experience with distributed training using tools like Ray and Kubernetes.
  • Strong organizational and communication skills.

Responsibilities

  • Lead development of a platform for large-scale ML models at Reddit.
  • Design end-to-end model lifecycle patterns (MLOps) including data preparation, model management, and experiment tracking.
  • Develop and support a graph ML codebase and platform for scalability.
  • Collaborate with ML engineers to improve model training time, efficiency, and GPU costs.
  • Optimize batch data processing using tools like Apache Beam, Apache Spark, and Ray Data.
  • Architect pipelines for massive graph data structures.