Apply

Lead Machine Learning Infrastructure Engineer - Infrastructure & Data

Posted 4 days agoViewed

View full description

💎 Seniority level: Lead

📍 Location: United States

💸 Salary: 185500.0 - 293750.0 USD per year

🔍 Industry: Software Development

🏢 Company: Upwork👥 501-1000💰 about 8 years ago🫂 Last layoff almost 2 years agoMarketplaceFreelanceCopywritingPeer to Peer

🗣️ Languages: English

🪄 Skills: AWSDockerLeadershipPythonSoftware DevelopmentSQLCloud ComputingJavaKubeflowKubernetesMachine LearningMLFlowAlgorithmsData engineeringData StructuresREST APICollaborationCI/CDProblem SolvingMentoringLinuxDevOpsTerraformExcellent communication skillsScalaData modeling

Requirements:
  • Strong technical expertise in designing and building scalable ML infrastructure.
  • Experience with distributed systems and cloud-based ML platforms.
  • Proficiency in programming languages such as Python, Java, or Scala.
  • Deep understanding of ML workflows, including data pipelines, model training, and deployment.
  • Passion for innovation and eagerness to implement the latest advancements in ML infrastructure.
  • Strong problem-solving skills and ability to optimize complex systems for performance and reliability.
  • Collaborative mindset with excellent communication skills to work across teams.
  • Ability to thrive in a fast-paced, dynamic environment with evolving technical challenges.
Responsibilities:
  • Design, implement, and optimize distributed systems and infrastructure components to support large-scale machine learning workflows, including data ingestion, feature engineering, model training, and serving.
  • Develop and maintain frameworks, libraries, and tools that streamline the end-to-end machine learning lifecycle, from data preparation and experimentation to model deployment and monitoring.
  • Architect and implement highly available, fault-tolerant, and secure systems that meet the performance and scalability requirements of production machine learning workloads.
  • Collaborate with machine learning researchers and data scientists to understand their requirements and translate them into scalable and efficient software solutions.
  • Stay current with advancements in machine learning infrastructure, distributed computing, and cloud technologies, integrating them into our platform to drive innovation.
  • Mentor junior engineers, conduct code reviews, and uphold engineering best practices to ensure the delivery of high-quality software solutions.
Apply

Related Jobs

Apply

📍 United States

🧭 Full-Time

💸 185500.0 - 293750.0 USD per year

🔍 Software Development

  • Strong technical expertise in designing and building scalable ML infrastructure.
  • Experience with distributed systems and cloud-based ML platforms.
  • Proficiency in programming languages such as Python, Java, or Scala.
  • Deep understanding of ML workflows, including data pipelines, model training, and deployment.
  • Passion for innovation and eagerness to implement the latest advancements in ML infrastructure.
  • Strong problem-solving skills and ability to optimize complex systems for performance and reliability.
  • Collaborative mindset with excellent communication skills to work across teams.
  • Ability to thrive in a fast-paced, dynamic environment with evolving technical challenges.
  • Design, implement, and optimize distributed systems and infrastructure components to support large-scale machine learning workflows, including data ingestion, feature engineering, model training, and serving.
  • Develop and maintain frameworks, libraries, and tools that streamline the end-to-end machine learning lifecycle, from data preparation and experimentation to model deployment and monitoring.
  • Architect and implement highly available, fault-tolerant, and secure systems that meet the performance and scalability requirements of production machine learning workloads.
  • Collaborate with machine learning researchers and data scientists to understand their requirements and translate them into scalable and efficient software solutions.
  • Stay current with advancements in machine learning infrastructure, distributed computing, and cloud technologies, integrating them into our platform to drive innovation.
  • Mentor junior engineers, conduct code reviews, and uphold engineering best practices to ensure the delivery of high-quality software solutions.

AWSDockerPythonCloud ComputingJavaKubernetesMachine LearningSoftware ArchitectureAlgorithmsData engineeringCI/CDScala

Posted 5 days ago
Apply