Machine Learning Engineer

Posted 12 days ago

💎 Seniority level: Middle, Several years

📍 Location: USA, Australia

🔍 Industry: AI

🏒 Company: Pluralis Research

🗣️ Languages: English

⏳ Experience: Several years

🪄 Skills: Python, Machine Learning, PyTorch, Algorithms, Data Structures, Networking

Requirements:
  • Master's degree in Computer Science or related field, or equivalent experience.
  • Strong understanding of model parallelism techniques, distributed training architectures, and optimization methods.
  • Expert-level skills in PyTorch or similar frameworks, with experience scaling models across multiple devices.
  • Familiarity with networking concepts, distributed computing principles, and performance optimization.
Responsibilities:
  • Build and optimize systems for training large models across heterogeneous hardware connected by low-bandwidth networks.
  • Implement techniques to reduce communication overhead while maintaining model convergence in challenging network environments.
  • Design and develop robust training pipelines that can recover from node failures and network disruptions.
  • Create efficient systems for deploying sharded models in a protocol-locked environment.
  • Develop tools to track training progress, evaluate model quality, and identify bottlenecks in distributed environments.

Related Jobs

📍 Australia

🧭 Full-Time

🔍 Software Development

  • More than 5 years of industry experience in a machine learning/software engineering role at a Product/SaaS company.
  • Experience with industry-level, high-scale natural language systems.
  • Experience building and deploying machine learning models, including a strong understanding of end-to-end machine learning pipelines and components.
  • Strong coding proficiency in Python (note that interviews will be in Python).
  • Familiar with several of the following: TensorFlow, PyTorch, scikit-learn, LangChain, and Hugging Face.
  • Experience with RAG architectures and/or a good understanding of their application.
  • Strong understanding of Computer Science/Engineering fundamentals and first principles covering system design, data structures, architecture, and design patterns.
  • Ideally, previous experience in a Customer Support and/or Business Process automation role.
  • Excellent collaboration and communication skills. You enjoy pairing with and mentoring other engineers.
  • Proven ability to set medium to long-term vision for the team in the AI space.
  • Partner with our leadership on developing an AI/ML strategy and roadmap to improve the customer support experience.
  • Be accountable for the delivery of the primary identified opportunities in UV, partnering with product and engineering teams.
  • Lead the ML development of a natural language understanding/processing system that integrates seamlessly into the Canva Product.
  • Collaborate with MLEs across the organisation to tap into existing ML capabilities and/or work on cross-organisation problems and solutions.

Python, SQL, Apache Airflow, Cloud Computing, Design Patterns, Machine Learning, NLTK, PyTorch, Algorithms, Data engineering, Data Structures, Tensorflow, Communication Skills, Collaboration, Problem Solving, RESTful APIs, Mentoring, Data modeling, Data analytics

Posted about 22 hours ago

📍 US

🧭 Full-Time

🔍 Software Development

🏒 Company: Yobi

👥 11-50

💰 $2,370,000 Seed about 2 years ago

CRM, Electronics, Big Data, B2B, Software

  • Can think creatively about data
  • Knows when to use a Data Clean Room
  • Understanding enough about machine learning to be dangerous
  • Worked on and can speak to some kind of impactful consumer-facing ML problem, e.g. recommender systems, personalization, etc.
  • Building systems to ingest partner data in a privacy-safe way
  • Getting the most out of the data to power the rest of our models and products
  • Collaborating with the Product org
  • Working with customers directly

Python, SQL, Apache Airflow, Data Analysis, Machine Learning, Product Management, Algorithms, Data engineering, Spark, CI/CD, Data modeling

Posted 3 days ago

📍 United States

🧭 Full-Time

🔍 AdTech

🏒 Company: Yobi

👥 11-50

💰 $2,370,000 Seed about 2 years ago

CRM, Electronics, Big Data, B2B, Software

  • AdTech experience and product intuition for the space
  • Understanding enough about machine learning
  • Skill- and attitude-wise, can quickly contribute to things such as orchestration/Airflow, Bazel (build systems, really), CI/CD, Spark (we have both Python and Scala), and other SQL-y data computation backends as needed.
  • Focus on the models, metrics, pipelines, systems, and services that power and deliver excellence via Yobi Applications products
  • Involve a large degree of 0-to-1 development
  • Rely on collaboration with Product, core signals MLEs, and leaning on your own expertise and insight in building holistic ML-powered products.

Python, SQL, Apache Airflow, Machine Learning, Product Management, Algorithms, Data engineering, Data science, Spark, CI/CD, RESTful APIs, Scala

Posted 3 days ago

📍 United States, Canada

🧭 Full-Time

💸 127,000 - 158,700 USD per year

🔍 Remote Sensing

🏒 Company: Planet

👥 501-1000

💰 $200,000,000 Post-IPO Equity over 3 years ago

🫂 Last layoff 9 months ago

Geospatial, Remote Sensing, Big Data, Aerospace, Analytics, Software

  • Bachelor's or Master's degree in Computer Science or a related field
  • 4+ years of professional experience in software engineering, including 2+ years developing and designing Computer Vision and/or Machine Learning technologies and systems
  • Proficiency with Python and machine learning frameworks like TensorFlow or PyTorch
  • Proficiency with software engineering best practices such as version control, testing and continuous integration/continuous deployment (CI/CD)
  • Experience with containerization and container orchestration tools like Docker, Kubernetes, Flyte or Temporal
  • Experience implementing model versioning, monitoring and observability systems
  • Establish and maintain machine learning operations workflows for regular data generation
  • Run experiments to evaluate machine learning algorithms
  • Perform ML operations to maintain production algorithms (monitoring, training, benchmarking, deploying, etc.)
  • Develop and implement automated testing to ensure the reliability of deployed models
  • Contribute to full-stack development, from backend and APIs to DevOps tasks and occasional front-end work

Docker, Python, Data Analysis, Image Processing, Kubernetes, Machine Learning, PyTorch, Algorithms, Tensorflow, CI/CD, RESTful APIs, DevOps, Software Engineering

Posted 4 days ago

📍 United States

🧭 Full-Time

💸 185,800 - 260,100 USD per year

🏒 Company: Reddit

👥 1001-5000

💰 $410,000,000 Series F over 3 years ago

🫂 Last layoff almost 2 years ago

News, Content, Social Network, Social Media

  • 2+ years of experience with industry-level deep learning models.
  • 2+ years of experience with mainstream ML frameworks (such as TensorFlow and PyTorch).
  • 3+ years of end-to-end experience training, evaluating, testing, and deploying industry-level models.
  • 3+ years of experience orchestrating complicated data generation pipelines on large-scale datasets.
  • Experience working with cross functional stakeholders across research, product & infrastructure to productize ML research
  • Knowledge of large scale search & recommender systems, or modern ads ranking/retrieval/targeting systems is preferred
  • Experience with deep learning, representation learning or transfer learning is preferred
  • Own end-to-end execution of ML-based targeting products like smart targeting expansion, keyword targeting, auto targeting, user lookalikes etc
  • Own offline & online experimentation of ML models for improving targeting products to drive advertiser outcomes
  • Research, implement, test, and launch new model architectures for retrieval using deep learning (GNNs, transformers, two tower models) with a focus on improving advertiser outcomes
  • Drive technical roadmaps and lead day to day project execution, and contribute meaningfully to team vision and strategy
  • Work on large scale data systems, backend services and product integration
  • Collaborate closely with multiple stakeholders across product, engineering, research, and marketing

AWS, Backend Development, Python, Machine Learning, Numpy, PyTorch, Data engineering, Spark, Tensorflow

Posted 5 days ago

📍 United States

🧭 Full-Time

💸 130,000 - 200,000 USD per year

🔍 Software Development

🏒 Company: Sadaora

  • 5+ years of experience developing, deploying, and maintaining ML models in production environments.
  • Proficiency in Python and common ML libraries (e.g., Scikit-learn, TensorFlow, PyTorch, XGBoost).
  • Strong foundation in statistics, linear algebra, probability, and optimization.
  • Deep understanding of a range of ML techniques (regression, classification, clustering, NLP, deep learning).
  • Experience with cloud platforms such as AWS, GCP, or Azure.
  • Familiarity with containerization and orchestration tools (Docker, Kubernetes).
  • Solid understanding of software engineering principles, version control (Git), and CI/CD workflows.
  • Design, train, and evaluate machine learning models using best-in-class frameworks.
  • Architect scalable ML solutions and pipelines, from feature engineering to deployment.
  • Implement rigorous testing, validation, and monitoring processes to ensure model reliability in production.
  • Work closely with data engineers to shape the data architecture required for robust ML workflows.
  • Build efficient ETL pipelines to clean, preprocess, and transform large-scale datasets.
  • Partner with product managers, engineers, and business stakeholders to define ML use cases.
  • Collaborate with software engineers to integrate ML models into production-grade APIs and applications.
  • Translate complex ML concepts into business-relevant insights and recommendations.
  • Stay current with advancements in machine learning, AI, and related fields.
  • Experiment with new algorithms, architectures, and tools to continuously enhance our capabilities.
  • Contribute to a culture of experimentation, technical excellence, and intellectual curiosity.

AWS, Docker, Python, ETL, GCP, Git, Kubernetes, Machine Learning, MLFlow, PyTorch, Azure, Data engineering, Tensorflow, CI/CD

Posted 7 days ago

📍 United States

🧭 Full-Time

🔍 Software Development

🏒 Company: Anomalo

👥 11-50

💰 $10,000,000 Series B 4 months ago

Data Management, Information Technology, Software

  • Strong Python expertise (ML engineering, data processing, and API development).
  • Experience deploying GenAI/LLM-based applications (e.g., chatbots, recommendation systems).
  • Familiarity with OpenAI, Anthropic Claude, and similar GenAI platforms.
  • Background in production-scale ML applications, particularly LLMs for enterprise use cases.
  • Develop and deploy GenAI-driven products, including customer-facing applications and internal tools for engineering teams.
  • Work with unstructured data and define product quality standards for a rapidly evolving space.
  • Collaborate with design partners and enterprise customers to understand how ML and LLMs can be applied effectively.
  • Build and refine retrieval-augmented generation (RAG) models and fine-tune LLMs for specific use cases.
  • Own projects from concept to production, ensuring scalability and performance in real-world environments.

AWS, Python, Machine Learning, API testing

Posted 7 days ago

📍 United States

🧭 Full-Time

💸 216,700 - 303,400 USD per year

🔍 Software Development

🏒 Company: Reddit

👥 1001-5000

💰 $410,000,000 Series F over 3 years ago

🫂 Last layoff almost 2 years ago

News, Content, Social Network, Social Media

  • 5+ years of experience in machine learning engineering, with a strong focus on recommendation systems, representation learning, and deep learning.
  • Hands-on experience with Graph Neural Networks (GNNs), collaborative filtering, and large-scale embeddings.
  • Proficiency in Python and experience with ML frameworks such as PyTorch Geometric (PyG), Deep Graph Library (DGL), TensorFlow, or JAX.
  • Strong understanding of graph theory, network science, and representation learning techniques.
  • Experience building distributed training and inference systems using ML infrastructure components (data parallelism, model pruning, inference optimization, etc.).
  • Ability to work in a fast-paced environment, balancing innovation with high-quality production deployment.
  • Strong communication skills and the ability to collaborate cross-functionally with engineers, researchers, and product teams.
  • Design and implement scalable, high-performance machine learning models using Graph Neural Networks (GNNs), transformers, and knowledge graph approaches.
  • Develop and optimize large-scale embedding generation pipelines for Reddit’s recommendation systems.
  • Collaborate with ML infrastructure teams to enable efficient distributed training (multi-GPU, model/data parallelism) and low-latency serving.
  • Work closely with cross-functional teams (Ads, Feed Ranking, Content Understanding) to integrate embeddings into various personalization and ranking systems.
  • Drive feature engineering efforts, identifying and curating expressive raw data to enhance model effectiveness.
  • Monitor, evaluate, and improve model performance using A/B testing, offline metrics, and real-time feedback loops.
  • Stay up-to-date with the latest research in GNNs, transformers, and representation learning, bringing new ideas into production.
  • Participate in code reviews, mentor junior engineers, and contribute to technical decision-making.

Python, Data Analysis, Keras, Machine Learning, MLFlow, PyTorch, Algorithms, Data Structures, Tensorflow, A/B testing

Posted 7 days ago

📍 United States

💸 106,000 - 120,000 USD per year

🔍 Nonprofit

🏒 Company: DataKind

👥 11-50

💰 $2,000,000 over 8 years ago

Artificial Intelligence (AI), Big Data, Analytics, Data Visualization

  • 3+ years of professional work experience in developing and deploying a machine learning product at scale
  • Foundational understanding of machine learning and statistical methods for predictive modeling
  • Expert in Python
  • Experience with cloud computing (GCP preferred)
  • Experience with databases (SQL, Postgres, PySpark, and/or other data query languages)
  • Experience with Databricks or a similar data intelligence platform
  • Experience with data warehousing, orchestration, integration, and ETL tools
  • Experience with modern source code management and software repository systems (e.g., Git)
  • Experience documenting and implementing RESTful APIs
  • Design, build, test, and maintain machine learning pipeline architectures (70%)
  • Produce high-quality, reusable code for data ingestion, validation, and processing pipelines
  • Architect and implement end-to-end ML pipelines including training, retraining, and inference systems for schools using the SST
  • Design and build APIs to easily access, integrate, and manage data from different sources
  • Ensure data infrastructure is in compliance with data governance and security policies
  • Create comprehensive documentation for data infrastructure and ML pipelines, tailored for both technical and non-technical stakeholders
  • Advance internal analytics reporting and automation capabilities as needed
  • Provide direct data support to partners (15%)
  • Manage initial data lifecycle processes for new school onboarding including ingestion, transfer, audit, and validation
  • Collaborate with data platform partners on integration and data transfer pipelines
  • Provide technical guidance to partners on how to share data formatted in alignment with our data model and with appropriate data governance measures
  • Address partner concerns regarding data security and ensure their specific requirements are satisfied
  • Support data science initiatives through processing, cleaning, and analyzing data as needed
  • Collaborate and contribute across DataKind (15%)
  • Support other data team members through code reviews and knowledge sharing across products
  • Collaborate with the Product, Engineering, and Research teams to ensure seamless integration and alignment of work
  • Effectively communicate project status and manage expectations with internal teams and partner organizations
  • Maintain accurate and current project information in project management tools like Asana

PostgreSQL, Python, SQL, Apache Airflow, Cloud Computing, ETL, GCP, Git, Machine Learning, API testing, Data engineering, RESTful APIs, Data visualization, Data modeling, Data management

Posted 10 days ago

📍 United States

💸 145,000 - 170,000 USD per year

🔍 Machine Learning

🏒 Company: Jobgether

👥 11-50

💰 $1,493,585 Seed about 2 years ago

Internet

  • 3+ years of industry experience in applied machine learning.
  • Master’s degree in Computer Science, Machine Learning, Mathematics, or a similar field.
  • Strong understanding of model building, optimization, and maintenance for production use.
  • Expertise in Python and experience with deep learning frameworks like PyTorch.
  • Hands-on experience in building deep learning models (e.g., transformers) for NLP tasks.
  • Strong grasp of experimental design and independent execution of collection, measurement, and result interpretation.
  • Design, develop, and deploy advanced machine learning algorithms, including retrieval, classification, and generative use cases.
  • Develop reliable and scalable production systems for machine learning models.
  • Maintain reusable codebases for data preprocessing, model training, evaluation, and deployment.
  • Guide and mentor junior engineers in software and machine learning best practices.
  • Work closely with cross-functional teams, including product managers, clinicians, and technical leadership, to ensure ML solutions align with company objectives.

AWS, Docker, Python, SQL, Kubernetes, Machine Learning, PyTorch, Algorithms, REST API

Posted 11 days ago