Data Engineer

Posted 2024-11-13

💎 Seniority level: Mid-level, 3+ years

📍 Location: Romania

🏢 Company: Awin

🗣️ Languages: EN

⏳ Experience: 3+ years

🪄 Skills: AWS, Python, Agile, Machine Learning, Salesforce, SCRUM, Strategy, Azure, Data engineering, Spark, Collaboration

Requirements:
  • Bachelor's or Master's degree in Data Science, Data Engineering, or Computer Science with a focus on math and statistics.
  • 3+ years of experience as a data engineer building data pipelines with Python, Spark, or similar.
  • Strong foundational knowledge in computer science principles and statistical methods.
  • Strong experience with cloud technology (AWS or Azure) and creation of data ingestion pipelines.
Responsibilities:
  • Cleanse and preprocess raw text data for analysis and modelling purposes (see the sketch after this list).
  • Build and maintain data ingestion pipelines from various sources like Salesforce to the data lake.
  • Ensure data quality, consistency, efficiency, and integrity during the ingestion and transformation processes.
  • Collaborate with data scientists to select and implement appropriate embedding methods.
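
A minimal sketch of the kind of raw-text cleansing this role describes, in plain Python. The normalization rules and sample records are illustrative assumptions, not Awin's actual pipeline:

```python
import html
import re
import unicodedata

def cleanse_text(raw: str) -> str:
    """Normalize one raw text record before analysis or modelling (illustrative rules)."""
    text = unicodedata.normalize("NFKC", raw)         # unify Unicode representations
    text = html.unescape(text)                        # decode entities like &nbsp;
    text = re.sub(r"<[^>]+>", " ", text)              # strip stray HTML tags
    text = re.sub(r"\s+", " ", text).strip().lower()  # collapse whitespace, lowercase
    return text

records = ["  Great&nbsp;product!<br>", "GREAT   product!"]
print([cleanse_text(r) for r in records])  # both normalize to 'great product!'
```
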
Related Jobs

๐Ÿ“ North America, South America, Europe

💸 100,000 - 500,000 USD per year

๐Ÿ” Web3, blockchain

๐Ÿข Company: Edge & Node

  • A self-motivated team member with keen attention to detail.
  • Proactive collaboration with team members and a willingness to adapt to a growing environment.
  • Familiarity and experience with Rust, particularly focusing on data transformation and ingestion.
  • A strong understanding of blockchain data structures and ingestion interfaces.
  • Experience in real-time data handling, including knowledge of reorg handling.
  • Familiarity with blockchain clients like Geth and Reth is a plus.
  • Adaptability to a dynamic and fully-remote work environment.
  • Rigorous approach to software development that reflects a commitment to excellence.

  • Develop and maintain data ingestion adapters for various blockchain networks and web3 protocols.
  • Implement data ingestion strategies for both historical and recent data.
  • Apply strategies for handling block reorgs (see the sketch after this list).
  • Optimize the latency of block ingestion at the chain head.
  • Write interfaces with file storage protocols such as IPFS and Arweave.
  • Collaborate with upstream data sources, such as chain clients and tracing frameworks, and monitor the latest upstream developments.
  • Perform data quality checks, cross-checking data across multiple sources and investigating any discrepancies that arise.
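
A minimal sketch of one common reorg-handling strategy: keep a sliding window of recently ingested block hashes and, when a new block's parent hash does not match the stored tip, rewind before re-ingesting. This is a generic Python illustration (the listing itself is Rust-focused), and the Block shape is an assumption:

```python
from dataclasses import dataclass

@dataclass
class Block:
    number: int
    hash: str
    parent_hash: str

class ReorgAwareIngester:
    """Keep a sliding window of ingested block hashes; rewind on parent-hash mismatch."""

    def __init__(self, depth: int = 64):
        self.depth = depth                    # how far back a reorg can plausibly reach
        self.canonical: dict[int, str] = {}   # block number -> ingested hash

    def ingest(self, block: Block) -> None:
        stored_parent = self.canonical.get(block.number - 1)
        if stored_parent is not None and stored_parent != block.parent_hash:
            # Reorg detected: the new block extends a different parent. Drop the
            # conflicting tip; a real system would walk back to the true common
            # ancestor and revert all data derived from the dropped blocks.
            for n in [k for k in self.canonical if k >= block.number - 1]:
                del self.canonical[n]
        self.canonical[block.number] = block.hash
        self.canonical.pop(block.number - self.depth, None)  # bound the window
```
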

Software Development, Blockchain, Data Structures, Rust, Collaboration, Attention to detail

Posted 2024-11-15

๐Ÿ“ North America, Latin America, Europe

๐Ÿ” Data consulting

  • Bachelor's degree in engineering, computer science, or an equivalent field.
  • 5+ years in related technical roles such as data management, database development, and ETL.
  • Expertise in evaluating and integrating data ingestion technologies.
  • Experience in designing and developing data warehouses with various platforms.
  • Proficiency in building ETL/ELT ingestion pipelines with tools like DataStage or Informatica.
  • Cloud experience on AWS; Azure and GCP experience is a plus.
  • Proficiency in Python scripting; Scala is required.

  • Designing and developing Snowflake Data Cloud solutions.
  • Creating data ingestion pipelines and working on data architecture (see the sketch after this list).
  • Ensuring data governance and security throughout customer projects.
  • Leading technical teams and collaborating with clients on data initiatives.
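
A minimal sketch of a Snowflake ingestion step using the snowflake-connector-python package: files land in an external stage and COPY INTO loads them into a raw table. The connection parameters, stage, and table names are hypothetical:

```python
import snowflake.connector

# Hypothetical credentials and object names; use a secrets manager in practice.
conn = snowflake.connector.connect(
    account="my_account",
    user="etl_user",
    password="***",
    warehouse="LOAD_WH",
    database="RAW",
    schema="LANDING",
)
cur = conn.cursor()
cur.execute(
    "CREATE TABLE IF NOT EXISTS raw_events (payload VARIANT, loaded_at TIMESTAMP_NTZ)"
)
# COPY INTO skips files it has already loaded, so reruns are safe.
cur.execute("""
    COPY INTO raw_events (payload, loaded_at)
    FROM (SELECT $1, CURRENT_TIMESTAMP() FROM @landing_stage/events/)
    FILE_FORMAT = (TYPE = PARQUET)
""")
conn.close()
```
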

AWS, Leadership, Python, SQL, Agile, ETL, Oracle, Snowflake, Data engineering, Spark, Collaboration

Posted 2024-11-07

๐Ÿ“ Romania

๐Ÿข Company: Jobgether

  • 8+ years of IT experience with at least 4 years working with Azure Databricks.
  • Strong proficiency in Python for data engineering, along with expertise in PySpark and SQL.
  • Knowledge of Azure Data components including Data Factory, Data Lake, SQL Data Warehouse (DW), and Azure SQL.
  • Experience in data modeling, source system analysis, and developing technical designs for data flows.
  • Familiarity with data profiling, cataloging, and mapping processes.
  • Experience with data visualization and exploration tools.

  • Lead the technical planning and execution of data migration, including ingestion, transformation, and storage within Azure Data Factory and Azure Data Lake.
  • Design and implement scalable data pipelines for seamless data movement from various sources using Azure Databricks.
  • Develop reusable frameworks for ingesting large datasets and implement data validation and cleansing mechanisms (see the sketch after this list).
  • Work with real-time streaming technologies to process and ingest data effectively.
  • Provide technical support during and after migration, resolving challenges as they arise.
  • Stay up-to-date on advancements in cloud computing and data engineering, recommending best practices and industry standards.
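
A minimal PySpark sketch of the validation-and-cleansing step such a Databricks pipeline might include; the storage paths, column names, and rules are hypothetical:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # provided automatically on Databricks

src = "abfss://landing@myaccount.dfs.core.windows.net/orders/"  # hypothetical ADLS paths
dst = "abfss://curated@myaccount.dfs.core.windows.net/orders/"

raw = spark.read.json(src)
clean = (
    raw.filter(F.col("order_id").isNotNull())          # reject rows missing the key
       .dropDuplicates(["order_id"])                   # de-duplicate on the business key
       .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
       .withColumn("ingest_date", F.current_date())
)
clean.write.format("delta").mode("append").partitionBy("ingest_date").save(dst)
```
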

Python, SQL, Cloud Computing, Azure, Data engineering

Posted 2024-11-07

๐Ÿ“ Any European country

๐Ÿงญ Full-Time

๐Ÿ” Software development

๐Ÿข Company: Janea Systems

  • Proven experience as a data engineer, preferably with 3+ years of relevant experience.
  • Experience designing cloud native solutions and implementations with Kubernetes.
  • Experience with Airflow or similar pipeline orchestration tools.
  • Strong Python programming skills.
  • Experience collaborating with Data Science and Engineering teams in production environments.
  • Solid understanding of SQL and relational data modeling schemas.
  • Preference for experience with Databricks or Spark.
  • Familiarity with modern data stack design and data lifecycle management.
  • Experience with distributed systems, microservices architecture, and cloud platforms like AWS, Azure, Google Cloud.
  • Excellent problem-solving skills and strong communication skills.

  • Develop and maintain data pipelines using Databricks, Airflow, or similar orchestration systems (see the sketch after this list).
  • Design and implement cloud-native solutions using Kubernetes for high availability.
  • Gather product data requirements and implement solutions to ingest and process data for applications.
  • Collaborate with Data Science and Engineering teams to optimize production-ready applications.
  • Cultivate data from various sources for data scientists and maintain documentation.
  • Design modern data stack for data scientists and ML engineers.
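
A minimal Airflow 2.x TaskFlow sketch of the kind of orchestrated pipeline this role owns; the task bodies, DAG name, and schedule are placeholder assumptions:

```python
from datetime import datetime
from airflow.decorators import dag, task

@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def product_data_pipeline():
    @task
    def extract() -> list[dict]:
        # Pull raw records from a source system (placeholder data).
        return [{"id": 1, "value": 42}]

    @task
    def transform(rows: list[dict]) -> list[dict]:
        return [r for r in rows if r["value"] is not None]

    @task
    def load(rows: list[dict]) -> None:
        print(f"loading {len(rows)} rows")  # stand-in for a warehouse write

    load(transform(extract()))

product_data_pipeline()
```
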

AWS, Python, Software Development, SQL, Kubernetes, Airflow, Azure, Data science, Spark, Collaboration

Posted 2024-11-07

๐Ÿ“ UK, EU

๐Ÿ” Consultancy

๐Ÿข Company: The Dot Collective

  • Advanced knowledge of distributed computing with Spark.
  • Extensive experience with AWS data offerings such as S3, Glue, Lambda.
  • Ability to build CI/CD processes including Infrastructure as Code (e.g. terraform).
  • Expert Python and SQL skills.
  • Agile ways of working.

  • Leading a team of data engineers.
  • Designing and implementing cloud-native data platforms.
  • Owning and managing technical roadmap.
  • Engineering well-tested, scalable, and reliable data pipelines.
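
A minimal sketch of what "well-tested" can mean for a pipeline: pure transformation functions exercised by a local-mode Spark test under pytest. The transformation and data are illustrative:

```python
from pyspark.sql import DataFrame, SparkSession

def dedupe_orders(df: DataFrame) -> DataFrame:
    """Keep one row per order_id (illustrative pipeline step)."""
    return df.dropDuplicates(["order_id"])

def test_dedupe_orders():
    spark = SparkSession.builder.master("local[1]").appName("test").getOrCreate()
    df = spark.createDataFrame(
        [(1, "widget"), (1, "widget"), (2, "gadget")],
        ["order_id", "item"],
    )
    assert dedupe_orders(df).count() == 2
    spark.stop()
```
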

AWS, Python, SQL, Agile, SCRUM, Spark, Collaboration, Agile methodologies

Posted 2024-11-07

๐Ÿ“ Romania

๐Ÿ” Big Data

๐Ÿข Company: CREATEQ

  • Several years of experience developing in a modern programming language, preferably Java and Python.
  • Significant experience with developing and maintaining distributed big data systems with production quality deployment and monitoring.
  • Exposure to high-performance data pipelines, preferably with Apache Kafka & Spark.
  • Experience with scheduling systems such as Airflow, and SQL/NoSQL databases.
  • Experience with cloud data platforms is a plus.
  • Exposure to Docker and/or Kubernetes is preferred.
  • Good command of spoken and written English.
  • University degree in computer science or equivalent professional experience.

  • Develop new data pipelines and maintain the data ecosystem, focusing on fault-tolerant ingestion, storage, data lifecycle, and computing metrics (see the sketch after this list).
  • Communicate efficiently with team members to develop software and creative solutions to meet customer needs.
  • Write high-quality, reusable code, test it, and deploy it to production.
  • Apply best practices according to industry standards while promoting a culture of agility and excellence.
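
A minimal sketch of fault-tolerant ingestion with Apache Kafka and Spark Structured Streaming: the checkpoint location is what lets the job resume without data loss after a failure. The broker, topic, and paths are hypothetical, and the Kafka connector package must be on the Spark classpath:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("events-ingest").getOrCreate()

stream = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical cluster
    .option("subscribe", "events")
    .option("startingOffsets", "earliest")
    .load()
)

query = (
    stream.selectExpr("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS payload")
    .writeStream.format("parquet")
    .option("path", "s3a://lake/raw/events/")
    .option("checkpointLocation", "s3a://lake/_checkpoints/events/")  # resume point after failures
    .trigger(processingTime="1 minute")
    .start()
)
query.awaitTermination()
```
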

Python, SQL, Java, Kafka, Airflow, Apache Kafka, NoSQL, Spark

Posted 2024-10-15

๐Ÿ“ Moldova, Romania, neighboring European countries

๐Ÿงญ Full-Time

๐Ÿ” Design platform

๐Ÿข Company: Mixbook๐Ÿ‘ฅ 51-100๐Ÿ’ฐ $10.0m Series B on 2011-08-01E-CommerceSocial NetworkConsumer Goods

  • Experience with Analytics Data Engineering: data modeling, data mart design, data transformations, data governance, data lineage, and data observability.
  • Bachelor's degree in Computer Science or a related field.
  • A solid grasp of Analytics Data engineering principles, practices, frameworks, and methodologies.
  • 4+ years of experience as an Analytics Engineer, Data Engineer, Data Infrastructure Engineer or Software Engineer.
  • Expert SQL, Python, and database programming skills; Ruby and/or other languages are a plus.
  • Proven experience with AWS or similar cloud databases, and with data management frameworks and tools (e.g., dbt, Coalesce, Monte Carlo).
  • Experience working with Looker PDTs and LookML is a plus.
  • Analytical mindset and excellent communication skills.
  • Able to partner with stakeholders across various levels of expertise.

  • Select and implement transformation and governance platform tools, identifying new technologies and features that will improve our data platforms.
  • Design data mart models with business users in mind. Develop data product specifications that enable business, analytics, and engineering teams.
  • Collaborate with engineering and analytics on the design, testing, and implementation of efficient data systems.
  • Establish observability best practices; verify data integrity and investigate discrepancies (see the sketch after this list).
  • Partner with business, analytics, and engineering teams to direct instrumentation needs and integrate into established data pipelines.
  • Develop transformation code and deploy to production following agile best practices.
  • Support engineering efforts to optimize database performance and uptime.
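
A minimal sketch of one observability practice named above: reconciling row counts between a source table and the mart built from it, and failing loudly on drift. The table names are hypothetical, and sqlite3 stands in for the actual warehouse driver:

```python
import sqlite3  # stand-in for the warehouse client; swap for your actual driver

def row_count(conn, table: str) -> int:
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

def check_reconciliation(conn, source: str, mart: str, tolerance: float = 0.001) -> None:
    src, dst = row_count(conn, source), row_count(conn, mart)
    drift = abs(src - dst) / max(src, 1)
    if drift > tolerance:
        raise RuntimeError(f"{mart} drifted {drift:.2%} from {source} ({dst} vs {src} rows)")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER)")
conn.execute("CREATE TABLE orders_mart (id INTEGER)")
check_reconciliation(conn, "orders", "orders_mart")  # passes: both tables agree
```
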

AWS, Python, SQL, Agile, ETL, Data engineering, Communication Skills

Posted 2024-09-20
🔥 Data Engineer
Posted 2024-08-26

๐Ÿ“ Americas, EMEA, APAC

๐Ÿ” Crypto and blockchain technology

  • 4+ years of work experience in relevant fields such as Data Engineer, DWH Engineer, or Software Engineer.
  • Experience with data warehouse technologies such as Presto, Athena, and Glue, and with relevant data modeling best practices.
  • Experience building data pipelines/ETL and familiarity with design principles; knowledge of Apache Airflow is a plus.
  • Excellent SQL and data manipulation skills using frameworks like Spark/PySpark or similar.
  • Proficiency in a major programming language such as Scala, Python, or Golang.
  • Experience with business requirements gathering for data sourcing.

  • Build scalable and reliable data pipelines that collect, transform, load, and curate data from internal systems (see the sketch after this list).
  • Augment data platform with data pipelines from select external systems.
  • Ensure high data quality for pipelines built and maintain auditability.
  • Drive data systems to approach real-time processing.
  • Support the design and deployment of a distributed data store as the central source of truth.
  • Build data connections to internal IT systems.
  • Develop and customize self-service tools for data consumers.
  • Evaluate new technologies and create prototypes for continuous improvements in data engineering.
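
A minimal PySpark sketch of the collect-transform-load loop described above, using a high-watermark column so each run only picks up new rows; the source path, columns, and watermark storage are hypothetical:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

last_watermark = "2024-08-01 00:00:00"  # in practice, read from a metadata/state table

incoming = (
    spark.read.parquet("s3a://internal-systems/transactions/")  # hypothetical source
    .filter(F.col("updated_at") > F.lit(last_watermark))        # new rows only
)

curated = (
    incoming.dropDuplicates(["txn_id"])
    .withColumn("amount_usd", F.col("amount").cast("double"))
    .withColumn("updated_date", F.to_date("updated_at"))
)
curated.write.mode("append").partitionBy("updated_date").parquet(
    "s3a://dwh/curated/transactions/"
)

# Persist the new high watermark for the next run.
new_watermark = incoming.agg(F.max("updated_at")).first()[0]
```
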

Python, SQL, ETL, Airflow, Data engineering, Spark

๐Ÿ“ Central EU or Americas

๐Ÿงญ Full-Time

๐Ÿ” Real Estate Investment

๐Ÿข Company: Roofstock๐Ÿ‘ฅ 501-1000๐Ÿ’ฐ $240.0m Series E on 2022-03-10๐Ÿซ‚ on 2023-03-22Rental PropertyPropTechMarketplaceReal EstateFinTech

  • BS or MS in a technical field: computer science, engineering or similar.
  • 5+ years technical experience working with data.
  • 5+ years of strong experience building scalable data services and applications using SQL, Python, or Java/Kotlin.
  • Deep understanding of microservices architecture and API development, including gRPC, REST/SOAP, and GraphQL.
  • Experience with AWS services including Messaging such as SQS, SNS, and familiarity with real-time data processing frameworks.
  • Significant experience building and deploying data-related infrastructure, robust data pipelines, and ETL/ELT code.
  • Strong understanding of data architecture and related problems.
  • Experience working on complex problems and distributed systems where scalability and performance are important.
  • Strong communication and interpersonal skills.
  • Independent work and effective collaboration with cross-functional teams.

  • Improve and maintain the data services platform.
  • Deliver high-quality data services promptly, ensuring data governance and integrity while meeting objectives and maintaining SLAs for data sharing across multiple products.
  • Develop effective architectures and produce key code components that contribute to the design, implementation, and maintenance of technical solutions.
  • Integrate a diverse network of third-party tools into a cohesive, scalable platform, optimizing code for enhanced scalability, performance, and readability.
  • Continuously improve system performance and reliability by diagnosing and resolving unexpected operational issues.
  • Ensure the team's work undergoes rigorous testing through repeatable, automated methods.
  • Support the data infrastructure and the rest of the data team in designing, implementing, and deploying scalable, fault-tolerant pipelines.
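
Given the AWS messaging services named in the requirements (SQS, SNS), a minimal boto3 sketch of a fault-tolerant queue consumer feeding such a pipeline; the queue URL and handler are hypothetical:

```python
import json
import boto3

QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/data-events"  # hypothetical

def handle(event: dict) -> None:
    print("processing", event.get("id"))  # stand-in for the real pipeline write

def consume_forever() -> None:
    sqs = boto3.client("sqs")
    while True:
        resp = sqs.receive_message(
            QueueUrl=QUEUE_URL,
            MaxNumberOfMessages=10,
            WaitTimeSeconds=20,  # long polling reduces empty receives
        )
        for msg in resp.get("Messages", []):
            handle(json.loads(msg["Body"]))
            # Delete only after successful processing; unacked messages reappear
            # after the visibility timeout, giving at-least-once delivery.
            sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])

if __name__ == "__main__":
    consume_forever()
```
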

AWS, Docker, GraphQL, Python, SQL, Agile, ETL, Java, Kafka, Kotlin, SCRUM, Snowflake, Airflow, Apache Kafka, Data engineering, gRPC

Posted 2024-08-10
🔥 Data Engineer
Posted 2024-07-18

๐Ÿ“ Africa, United Kingdom, Europe, Middle East

๐Ÿงญ Full-Time

๐Ÿ” Sports and Digital Entertainment

  • 4+ years of experience in a data engineering or similar role.
  • Excellent programming skills in Python and Spark (PySpark / Databricks).
  • 2+ years' experience with Databricks and Azure data services.
  • Experience with other cloud-based data management environments (AWS, Google Cloud, etc.) is an advantage.
  • Experience working with Customer Data Platforms is a plus.
  • Knowledge of managing data quality, including monitoring and alerting.
  • Good understanding of application and database development lifecycles.
  • Experience with remote working and ideally with hyper-growth startups.

  • Building and managing a highly robust and scalable Data Lake/ETL infrastructure.
  • Creating a scalable data pipeline for streaming and batch processing.
  • Ensuring data integrity through fault-tolerant systems and automated data quality monitoring (see the sketch after this list).
  • Continuously improving processes and optimizing performance and scalability.
  • Ensuring privacy and data security are prioritized.
  • Documenting the Data Platform stack comprehensively.
  • Partnering with business stakeholders and product engineering to deliver data products.
  • Collaborating with stakeholders to shape requirements and drive the data platform roadmap.
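
A minimal sketch of the automated data quality monitoring listed above: freshness and null-rate checks that raise alerts when thresholds are breached. The record shape, columns, and thresholds are hypothetical:

```python
from datetime import datetime, timedelta, timezone

def check_quality(rows: list[dict], max_age: timedelta, max_null_rate: float) -> list[str]:
    """Return a list of alert messages; empty means the batch looks healthy."""
    alerts = []
    if not rows:
        return ["batch is empty"]
    newest = max(r["event_time"] for r in rows)
    if datetime.now(timezone.utc) - newest > max_age:
        alerts.append(f"stale data: newest event at {newest}")
    null_rate = sum(r["user_id"] is None for r in rows) / len(rows)
    if null_rate > max_null_rate:
        alerts.append(f"user_id null rate {null_rate:.1%} exceeds {max_null_rate:.1%}")
    return alerts

batch = [{"event_time": datetime.now(timezone.utc), "user_id": 7},
         {"event_time": datetime.now(timezone.utc) - timedelta(minutes=5), "user_id": None}]
for alert in check_quality(batch, max_age=timedelta(hours=1), max_null_rate=0.25):
    print("ALERT:", alert)  # stand-in for a paging or Slack notification
```
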

Python, Agile, ETL, Azure, Data engineering, Spark, Documentation
