Data Engineer

Spain | Full-Time | Middle
Salary not disclosed

Job Details

Languages: English
Experience: 4+ years
Required Skills: AWS, Python, SQL, Airflow, Spark, dbt

Requirements

  • 4+ years of professional experience in data engineering with strong exposure to large-scale AWS and Spark environments
  • Advanced proficiency in SQL and Python for data processing and transformation at scale
  • Strong experience with AWS data services including S3, Glue, Athena, Redshift, EMR, and orchestration tools
  • Proven experience building and maintaining data models using dbt or similar frameworks
  • Hands-on experience with data quality, validation, and testing frameworks such as Great Expectations
  • Strong understanding of data governance, lineage, and reproducibility in production environments
  • Experience with entity resolution, deduplication, or record linkage across multiple data sources
  • Familiarity with anonymization and pseudonymization techniques in regulated environments
  • Experience working in regulated industries such as BFSI, healthcare, or government is highly valued
  • Ability to work independently or as a lead engineer within a small, fast-moving delivery team
  • Strong written and verbal communication skills in English, with the ability to document and explain complex systems clearly

Responsibilities

  • Rebuild and validate data pipelines to ensure full reproducibility of reporting and descriptive statistics across all datasets
  • Profile, reconcile, and harmonize heterogeneous source schemas across multiple business entities into a unified data model
  • Design and implement dbt-based data models (staging, intermediate, and marts) with strong testing and validation layers
  • Develop and maintain data quality frameworks using tools such as Great Expectations and dbt tests to enforce reliability
  • Design and implement entity resolution and record linkage logic across fragmented customer and account datasets
  • Ensure robust anonymization and pseudonymization processes that meet regulatory and compliance requirements
  • Optimize large-scale Spark-based processing jobs, including partitioning strategies, file formats, and cost-efficient compute usage
  • Orchestrate production-grade pipelines using tools such as Airflow or AWS Step Functions
  • Deliver clean, documented, and feature-ready datasets for downstream data science and risk modelling teams
  • Create clear technical documentation and runbooks to support operational handover and long-term maintainability