Machine Learning Engineer (Platform)

Remote-US; remote role open to candidates who are currently authorized to work in either the United States or Canada
Full-Time
Middle
Salary: 140,000 - 180,000 USD per year

Job Details

Experience
5+ years of industry software engineering experience
Required Skills
AWS, Docker, Python, Kubernetes, Machine Learning, PyTorch, TensorFlow, Distributed Systems

Requirements

  • 5+ years of industry software engineering experience
  • 4+ years of industry experience using one of PyTorch, TensorFlow, or JAX in Python
  • 3+ years of industry experience building with AWS, Docker, and Kubernetes
  • 1+ years of industry experience optimizing large-scale, high data-throughput, distributed machine learning training pipelines

Responsibilities

  • Own Artera’s ML compute infrastructure, including scaling up Artera’s foundation model development by building distributed training infrastructure and developer libraries.
  • Build and evolve the core libraries used by AI scientists to develop, launch, and monitor AI products.
  • Work with model developers to optimize GPU and CPU efficiency and data throughput of large-scale foundation models and downstream model training runs.
  • Optimize Artera’s ability to efficiently store and serve terabytes of digital pathology data for large-scale training regimes.
  • Ensure that Artera’s observability infrastructure provides a clear picture of how to continue to optimize performance across our model landscape.