Machine Learning Engineer (Platform)
Remote-US; remote role open to candidates who are currently authorized to work in either the United States or Canada
Full-Time
Middle
Salary: 140,000 - 180,000 USD per year
Job Details
- Experience: 5+ years of industry software engineering experience
- Required Skills: AWS, Docker, Python, Kubernetes, Machine Learning, PyTorch, TensorFlow, Distributed Systems
Requirements
- 5+ years of industry software engineering experience
- 4+ years of industry experience using one of PyTorch, TensorFlow, or JAX in Python
- 3+ years of industry experience building with AWS, Docker, and Kubernetes
- 1+ years of industry experience optimizing large-scale, high data-throughput, distributed machine learning training pipelines
Responsibilities
- Be accountable for Artera’s ML compute infrastructure, including scaling up Artera’s foundation model development by building distributed training infrastructure and developer libraries.
- Build and evolve the core libraries used by AI scientists to develop, launch, and monitor AI products.
- Work with model developers to optimize GPU and CPU efficiency and data throughput of large-scale foundation models and downstream model training runs.
- Optimize Artera’s ability to store and serve terabytes of digital pathology data efficiently for large-scale training regimes.
- Ensure that Artera’s observability infrastructure provides a clear picture of how to continue to optimize performance across our model landscape.