Data Engineer

Posted 2 months ago

πŸ’Ž Seniority level: Senior, At least four years

πŸ“ Location: Italy

πŸ” Industry: Artificial Intelligence

🏒 Company: iGenius πŸ‘₯ 101-250 πŸ’° $20,161,290 Series A over 2 years ago | Artificial Intelligence (AI), Business Intelligence, Analytics, Information Technology

πŸ—£οΈ Languages: English

⏳ Experience: At least four years

πŸͺ„ Skills: Python, SQL, Agile, Jenkins, Airflow, NoSQL, Collaboration, CI/CD, Scala

Requirements:
  • At least four years of proven experience in a data engineer role.
  • A degree in Computer Science, Applied Math, Informatics, Information Systems or similar.
  • Experience building processes for data transformation, data structure, metadata, dependency, and workload management.
  • Advanced working knowledge of relational databases and familiarity with SQL and NoSQL.
  • Experience building and optimizing data pipelines and architectures.
  • Proficient in Python or Scala.
  • Experience with distributed computing and object-oriented design.
  • Knowledge of pipeline and workflow management tools (e.g., Airflow, Argo Workflows).
  • Experience with CI/CD tools (e.g., Jenkins, Travis, Argo CD, Terraform).
  • Good understanding of Cloud and data models (data warehouse, data lake).
  • Experience with MLOps and HPC is a plus.
Responsibilities:
  • Create and maintain optimal data pipeline architectures.
  • Assemble large, complex data sets that meet functional and business requirements.
  • Identify, design, and implement improvements to internal practices such as automating manual processes.
  • Build the infrastructure required for optimal extraction, transformation, and loading of data from various sources (see the sketch below).
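The requirements above name Airflow as an example workflow-management tool; purely as an illustration of the kind of ETL pipeline this posting describes, here is a minimal sketch assuming a recent Airflow 2.x TaskFlow API. The DAG id, schedule, and the extract/transform/load bodies are hypothetical placeholders, not part of the posting.

```python
# Minimal Airflow 2.x TaskFlow sketch of an extract-transform-load pipeline.
# The DAG id, schedule, and task bodies are hypothetical placeholders.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def example_etl():
    @task
    def extract() -> list[dict]:
        # Pull raw records from a source system (placeholder).
        return [{"id": 1, "value": 21}]

    @task
    def transform(records: list[dict]) -> list[dict]:
        # Apply a trivial transformation (placeholder).
        return [{**r, "value": r["value"] * 2} for r in records]

    @task
    def load(records: list[dict]) -> None:
        # Write the transformed records to a target store (placeholder).
        print(f"loading {len(records)} records")

    load(transform(extract()))


example_etl()
```

Data passed between tasks here travels over XCom, which the TaskFlow API handles implicitly.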
Apply

Related Jobs

πŸ“ US, Europe

🧭 Full-Time

πŸ’Έ 175,000 - 205,000 USD per year

πŸ” Cloud computing and AI services

🏒 Company: CoreWeave πŸ’° $642,000,000 Secondary Market about 1 year ago | Cloud Computing, Machine Learning, Information Technology, Cloud Infrastructure

  • 5+ years of experience with Kubernetes and Helm, with a deep understanding of container orchestration.
  • Hands-on experience administering and optimizing clustered computing technologies on Kubernetes, such as Spark, Trino, Flink, Ray, Kafka, StarRocks or similar.
  • 5+ years of programming experience in C++, C#, Java, or Python.
  • 3+ years of experience scripting in Python or Bash for automation and tooling.
  • Strong understanding of data storage technologies, distributed computing, and big data processing pipelines.
  • Proficiency in data security best practices and managing access in complex systems.

  • Architect, deploy, and scale data storage and processing infrastructure to support analytics and data science workloads.
  • Manage and maintain data lake and clustered computing services, ensuring reliability, security, and scalability (see the sketch after this list).
  • Build and optimize frameworks and tools to simplify the usage of big data technologies.
  • Collaborate with cross-functional teams to align data infrastructure with business goals and requirements.
  • Ensure data governance and security best practices across all platforms.
  • Monitor, troubleshoot, and optimize system performance and resource utilization.
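As a loose illustration of the Spark-style data-lake batch processing this role describes, here is a minimal PySpark sketch; the bucket paths, column names, and the aggregation itself are hypothetical, not from the posting.

```python
# Minimal PySpark sketch of a data-lake batch job; paths, columns, and the
# aggregation are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("datalake-batch-example").getOrCreate()

# Read raw events from object storage (Parquet assumed for the example).
events = spark.read.parquet("s3a://example-bucket/raw/events/")

# Aggregate daily event counts per user.
daily_counts = (
    events.withColumn("day", F.to_date("event_time"))
    .groupBy("day", "user_id")
    .count()
)

# Write the result back, partitioned by day.
daily_counts.write.mode("overwrite").partitionBy("day").parquet(
    "s3a://example-bucket/curated/daily_counts/"
)
```

On Kubernetes, a job like this would typically be submitted via spark-submit or a Spark operator rather than run locally.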

Python, Bash, Kubernetes, Apache Kafka

Posted 6 days ago
Apply

πŸ“ South Africa, Mauritius, Kenya, Nigeria

πŸ” Technology, Marketplaces

  • BSc degree in Computer Science, Information Systems, Engineering, or related technical field or equivalent work experience.
  • 3+ years related work experience.
  • Minimum of 2 years of experience building and optimizing β€˜big data’ pipelines and architectures, and maintaining data sets.
  • Experienced in Python.
  • Experienced in SQL (PostgreSQL, MS SQL).
  • Experienced in using cloud services: AWS, Azure or GCP.
  • Proficiency in version control, CI/CD and GitHub.
  • Understanding of or experience with Glue and PySpark is highly desirable.
  • Experience in managing data life cycle.
  • Proficiency in manipulating, processing and architecting large disconnected data sets for analytical requirements.
  • Ability to maintain and optimise processes supporting data transformation, data structures, metadata, dependency and workload management.
  • Good understanding of data management principles - data quality assurance and governance.
  • Strong analytical skills related to working with unstructured datasets.
  • Understanding of message queuing, stream processing, and highly scalable β€˜big data’ datastores.
  • Strong attention to detail.
  • Good communication and interpersonal skills.

  • Suggest efficiencies and implement internal process improvements, such as automating manual processes.
  • Implement enhancements and new features across data systems.
  • Improve and streamline processes within data systems with support from the Senior Data Engineer.
  • Test CI/CD processes for optimal data pipelines.
  • Assemble large, complex data sets that meet functional and non-functional business requirements.
  • Build and maintain highly efficient ETL processes.
  • Develop and conduct unit tests on data pipelines and ensure data consistency (see the sketch after this list).
  • Develop and maintain automated monitoring solutions.
  • Support reporting and analytics infrastructure.
  • Maintain data quality and data governance, and handle the overall upkeep of data infrastructure systems.
  • Maintain data warehouse and data lake metadata, data catalogue, and user documentation for internal business users.
  • Ensure best practices are implemented and maintained across databases.
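As a hedged example of the pipeline unit-testing responsibility referenced above, here is a small pytest-style consistency check; the `transform_orders` function and the column names are hypothetical, not from the posting.

```python
# Sketch of a data-consistency unit test for a pipeline step.
# transform_orders and the column names are hypothetical placeholders.
import pandas as pd


def transform_orders(raw: pd.DataFrame) -> pd.DataFrame:
    # Placeholder transformation: drop rows missing an order id, add a total column.
    cleaned = raw.dropna(subset=["order_id"]).copy()
    cleaned["total"] = cleaned["quantity"] * cleaned["unit_price"]
    return cleaned


def test_transform_orders_consistency():
    raw = pd.DataFrame(
        {
            "order_id": [1, 2, None],
            "quantity": [2, 1, 5],
            "unit_price": [10.0, 3.5, 1.0],
        }
    )
    out = transform_orders(raw)

    # No null keys survive, and totals are computed as expected.
    assert out["order_id"].notna().all()
    assert (out["total"] == out["quantity"] * out["unit_price"]).all()
```

A check like this would run under `pytest` as part of the CI/CD process the posting mentions.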

AWS, PostgreSQL, Python, SQL, ETL, Git, CI/CD

Posted 8 days ago
Apply

πŸ“ North America, South America, Europe

πŸ’Έ 100,000 - 500,000 USD per year

πŸ” Web3, blockchain

🏒 Company: Edge & Node πŸ‘₯ 11-50 | Software

  • A self-motivated team member with keen attention to detail.
  • Proactive collaboration with team members and a willingness to adapt to a growing environment.
  • Familiarity and experience with Rust, particularly focusing on data transformation and ingestion.
  • A strong understanding of blockchain data structures and ingestion interfaces.
  • Experience in real-time data handling, including knowledge of reorg handling.
  • Familiarity with blockchain clients like Geth and Reth is a plus.
  • Adaptability to a dynamic and fully-remote work environment.
  • Rigorous approach to software development that reflects a commitment to excellence.

  • Develop and maintain data ingestion adapters for various blockchain networks and web3 protocols.
  • Implement data ingestion strategies for both historical and recent data.
  • Apply strategies for handling block reorgs (outlined in the sketch after this list).
  • Optimize the latency of block ingestion at the chain head.
  • Write interfaces with file storage protocols such as IPFS and Arweave.
  • Collaborate with upstream data sources, such as chain clients and tracing frameworks, and monitor the latest upstream developments.
  • Perform data quality checks, cross-checking data across multiple sources and investigating any discrepancies that arise.
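The reorg-handling responsibility flagged above can be sketched briefly. The role itself is Rust-focused, but the idea is language-agnostic, so a Python outline is shown here; `fetch_block`, the `Block` fields, and the `last_seen` index are hypothetical.

```python
# Sketch of chain-head ingestion with simple reorg detection.
# fetch_block, the Block fields, and last_seen are hypothetical placeholders.
from dataclasses import dataclass


@dataclass
class Block:
    number: int
    hash: str
    parent_hash: str


def blocks_to_ingest(head: Block, last_seen: dict[int, str], fetch_block) -> list[Block]:
    """Return blocks to (re)ingest, walking back while parent hashes don't match."""
    to_ingest = [head]
    block = head
    # Walk backwards until the parent hash matches what we already ingested,
    # i.e. the point where the canonical chain and our local view agree again.
    while last_seen.get(block.number - 1) not in (None, block.parent_hash):
        block = fetch_block(block.parent_hash)
        to_ingest.append(block)
    return list(reversed(to_ingest))
```

The walk-back stops either when the new chain reconnects to a block already ingested or when it passes below the recorded history.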

Software Development, Blockchain, Data Structures, Rust, Collaboration, Attention to detail

Posted 2 months ago
Apply

πŸ“ UK, EU

πŸ” Consultancy

🏒 Company: The Dot Collective πŸ‘₯ 11-50 | Cloud Computing, Analytics, Information Technology

  • Advanced knowledge of distributed computing with Spark.
  • Extensive experience with AWS data offerings such as S3, Glue, and Lambda (sketched below).
  • Ability to build CI/CD processes, including Infrastructure as Code (e.g., Terraform).
  • Expert Python and SQL skills.
  • Agile ways of working.

  • Leading a team of data engineers.
  • Designing and implementing cloud-native data platforms.
  • Owning and managing technical roadmap.
  • Engineering well-tested, scalable, and reliable data pipelines.
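As a rough illustration of the AWS data offerings named in the requirements above (S3, Glue, Lambda), here is a hypothetical Lambda handler that starts a Glue job when a new object lands in S3; the Glue job name and argument keys are assumptions, not from the posting.

```python
# Sketch: S3-triggered Lambda that starts a Glue job for each new object.
# The Glue job name and argument keys are hypothetical placeholders.
import boto3

glue = boto3.client("glue")


def handler(event, context):
    # Each record in an S3 event notification describes one new object.
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        glue.start_job_run(
            JobName="example-curation-job",
            Arguments={"--source_path": f"s3://{bucket}/{key}"},
        )
    return {"status": "ok"}
```

In practice the Lambda, its S3 trigger, and the Glue job would themselves be defined as Infrastructure as Code (e.g., Terraform), as the requirements mention.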

AWS, Python, SQL, Agile, SCRUM, Spark, Collaboration, Agile methodologies

Posted 2 months ago
Apply

πŸ“ Central EU or Americas

🧭 Full-Time

πŸ” Real Estate Investment

🏒 Company: Roofstock πŸ‘₯ 501-1000 πŸ’° $240,000,000 Series E almost 3 years ago πŸ«‚ Last layoff almost 2 years ago | Rental Property, PropTech, Marketplace, Real Estate, FinTech

  • BS or MS in a technical field: computer science, engineering or similar.
  • 5+ years technical experience working with data.
  • 5+ years of strong experience building scalable data services and applications using SQL, Python, or Java/Kotlin.
  • Deep understanding of microservices architecture and RESTful API development including gRPC, REST/SOAP, GraphQL.
  • Experience with AWS services, including messaging (e.g., SQS, SNS), and familiarity with real-time data processing frameworks (sketched below).
  • Significant experience building and deploying data-related infrastructure, robust data pipelines, and ETL/ELT code.
  • Strong understanding of data architecture and related problems.
  • Experience working on complex problems and distributed systems where scalability and performance are important.
  • Strong communication and interpersonal skills.
  • Independent work and effective collaboration with cross-functional teams.

  • Improve and maintain the data services platform.
  • Deliver high-quality data services promptly, ensuring data governance and integrity while meeting objectives and maintaining SLAs for data sharing across multiple products.
  • Develop effective architectures and produce key code components that contribute to the design, implementation, and maintenance of technical solutions.
  • Integrate a diverse network of third-party tools into a cohesive, scalable platform, optimizing code for enhanced scalability, performance, and readability.
  • Continuously improve system performance and reliability by diagnosing and resolving unexpected operational issues.
  • Ensure the team's work undergoes rigorous testing through repeatable, automated methods.
  • Support the data infrastructure and the rest of the data team in designing, implementing, and deploying scalable, fault-tolerant pipelines.
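To make the SQS/SNS messaging requirement noted above concrete, here is a minimal sketch of an SQS long-polling loop that feeds a downstream pipeline step; the queue URL and the `process` function are hypothetical, not from the posting.

```python
# Sketch: long-poll an SQS queue and hand messages to a (hypothetical) processor.
import json

import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/example-events"  # hypothetical


def process(payload: dict) -> None:
    # Placeholder for the real pipeline step.
    print("processing", payload)


def poll_once() -> None:
    resp = sqs.receive_message(
        QueueUrl=QUEUE_URL, MaxNumberOfMessages=10, WaitTimeSeconds=20
    )
    for msg in resp.get("Messages", []):
        process(json.loads(msg["Body"]))
        # Delete only after successful processing so failures are retried.
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```

Deleting a message only after it is processed means failed messages reappear after the visibility timeout and can be retried.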

AWS, Docker, GraphQL, Python, SQL, Agile, ETL, Java, Kafka, Kotlin, SCRUM, Snowflake, Airflow, Apache Kafka, Data engineering, gRPC

Posted 5 months ago
Apply

πŸ“ Central EU or Americas

🧭 Full-Time

πŸ” Real estate investment

🏒 Company: Roofstock πŸ‘₯ 501-1000 πŸ’° $240,000,000 Series E almost 3 years ago πŸ«‚ Last layoff almost 2 years ago | Rental Property, PropTech, Marketplace, Real Estate, FinTech

  • BS or MS in a technical field: computer science, engineering or similar.
  • 8+ years technical experience working with data.
  • 5+ years strong experience building scalable data services and applications using SQL, Python, Java/Kotlin.
  • Deep understanding of microservices architecture and RESTful API development.
  • Experience with AWS services including messaging and familiarity with real-time data processing frameworks.
  • Significant experience building and deploying data-related infrastructure and robust data pipelines.
  • Strong understanding of data architecture and related challenges.
  • Experience with complex problems and distributed systems focusing on scalability and performance.
  • Strong communication and interpersonal skills.
  • Independent worker able to collaborate with cross-functional teams.

  • Improve and maintain the data services platform.
  • Deliver high-quality data services promptly, ensuring data governance and integrity while meeting objectives and maintaining SLAs.
  • Develop effective architectures and produce key code components contributing to technical solutions.
  • Integrate a diverse network of third-party tools into a cohesive, scalable platform.
  • Continuously enhance system performance and reliability by diagnosing and resolving operational issues.
  • Ensure rigorous testing of the team's work through automated methods.
  • Support data infrastructure and collaborate with the data team on scalable data pipelines.
  • Work within an Agile/Scrum framework with cross-functional teams to deliver value.
  • Influence the enterprise data platform architecture and standards.

AWS, Docker, Python, SQL, Agile, ETL, SCRUM, Snowflake, Airflow, Data engineering, gRPC, RESTful APIs, Microservices

Posted 5 months ago
Apply