Research Engineer - Data

Posted 13 days ago

📍 Location: United Kingdom

🔍 Industry: AI

🏢 Company: Leonardo.Ai

🗣️ Languages: English

🪄 Skills: AWS, Docker, PostgreSQL, Python, SQL, Cloud Computing, Data Analysis, ETL, Hadoop, Image Processing, Kubernetes, Machine Learning, MLFlow, MongoDB, MySQL, PyTorch, Algorithms, Azure, Cassandra, Data engineering, Data science, Data Structures, REST API, NoSQL, Spark, CI/CD, JSON, Data modeling

Requirements:
  • Hands-on experience with images, videos, 3D geometry (mesh/solid modeling), and/or text data. Well-rounded expertise in Python and PyTorch
  • Passion for synthetic data generation using inference from pretrained models, 3D rendering engines, and/or other software
  • Demonstrated proficiency in setting up large-scale, robust data pipelines using frameworks like Spark, Ray, or Metaflow. Comfortable with model versioning and experiment tracking
  • Good understanding of parallel and distributed computing. Experienced with setting up evaluation methods
  • Experience with AWS, Azure, or other cloud platforms. Proficient in both relational (MySQL, PostgreSQL) and NoSQL (MongoDB, Cassandra) databases, plus vector data stores
Responsibilities:
  • Lead the ingestion, unification, and organization of large, unstructured data sources
  • Develop and optimize distributed systems for data processing
  • Build and orchestrate pipelines to generate synthetic data at scale
  • Design and conduct experiments on dataset quality, scalability, and performance
  • Collaborate with legal and safety teams to ensure all data usage respects privacy, security, and ethical standards
  • Contribute to internal and external libraries or frameworks