Data Engineer
Spain, Full-Time, Middle
Salary not disclosed
Job Details
- Languages: English
- Experience: 4+ years
- Required Skills: AWS, Python, SQL, Airflow, Spark, dbt
Requirements
- 4+ years of professional experience in data engineering with strong exposure to large-scale AWS and Spark environments
- Advanced proficiency in SQL and Python for data processing and transformation at scale
- Strong experience with AWS data services including S3, Glue, Athena, Redshift, EMR, and orchestration tools
- Proven experience building and maintaining data models using dbt or similar frameworks
- Hands-on experience with data quality, validation, and testing frameworks such as Great Expectations
- Strong understanding of data governance, lineage, and reproducibility in production environments
- Experience with entity resolution, deduplication, or record linkage across multiple data sources
- Familiarity with anonymization and pseudonymization techniques in regulated environments
- Experience working in regulated industries such as BFSI (banking, financial services, and insurance), healthcare, or government is highly valued
- Ability to work independently or as a lead engineer within a small, fast-moving delivery team
- Strong written and verbal communication skills in English, with the ability to document and explain complex systems clearly
Responsibilities
- Rebuild and validate data pipelines to ensure full reproducibility of reporting and descriptive statistics across all datasets
- Profile, reconcile, and harmonize heterogeneous source schemas across multiple business entities into a unified data model
- Design and implement dbt-based data models (staging, intermediate, and marts) with strong testing and validation layers
- Develop and maintain data quality frameworks using tools such as Great Expectations and dbt tests to enforce reliability
- Design and implement entity resolution and record linkage logic across fragmented customer and account datasets
- Ensure robust anonymization and pseudonymization processes that meet regulatory and compliance requirements
- Optimize large-scale Spark-based processing jobs, including partitioning strategies, file formats, and cost-efficient compute usage
- Orchestrate production-grade pipelines using tools such as Airflow or AWS Step Functions
- Deliver clean, documented, and feature-ready datasets for downstream data science and risk modelling teams
- Create clear technical documentation and runbooks to support operational handover and long-term maintainability