- Mapping BM units, substations, and fuel types across Elexon, National Grid, and other sources; maintaining master/reference datasets
- Documenting mappings, assumptions, and known limitations clearly for downstream users
- Reconciling legacy and current data formats; ensuring consistency between Elexon message types
- Investigating discrepancies between data sources to determine authoritative values
- Cleaning time-series data: detecting outliers, filling gaps, resolving duplicates, and understanding root causes of quality issues
- Developing reusable Python-based cleaning routines and data grabbers for APIs
- Building dbt models and designing PostgreSQL schemas for clean, analysis-ready datasets
- Orchestrating workflows using GitHub Actions
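
To give a flavour of the time-series cleaning work listed above, here is a minimal sketch of a reusable routine in plain Python. It is illustrative only: the function names (`dedupe`, `fill_gaps`, `flag_outliers`), the keep-the-latest duplicate rule, linear interpolation for gaps, the median-absolute-deviation outlier test, and its threshold are all assumptions made for this example, not a description of any existing codebase. The 30-minute step assumes half-hourly data.

```python
from datetime import datetime, timedelta
from statistics import median

STEP = timedelta(minutes=30)  # assumption: half-hourly readings


def dedupe(rows):
    """Keep the last value seen for each timestamp; return rows sorted by time."""
    latest = {}
    for ts, value in rows:
        latest[ts] = value  # later duplicates overwrite earlier ones
    return sorted(latest.items())


def fill_gaps(rows, step=STEP):
    """Linearly interpolate missing timestamps.

    Expects rows sorted by time with unique timestamps.
    Returns (timestamp, value, was_filled) tuples so filled points stay traceable.
    """
    filled = []
    for (t0, v0), (t1, v1) in zip(rows, rows[1:]):
        filled.append((t0, v0, False))
        n = round((t1 - t0) / step)  # number of steps between neighbours
        for k in range(1, n):
            filled.append((t0 + k * step, v0 + (v1 - v0) * k / n, True))
    if rows:
        filled.append((rows[-1][0], rows[-1][1], False))
    return filled


def flag_outliers(values, threshold=5.0):
    """Flag values whose distance from the median exceeds threshold * MAD."""
    if not values:
        return []
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [False] * len(values)  # no spread, nothing to flag
    return [abs(v - med) / mad > threshold for v in values]


if __name__ == "__main__":
    t0 = datetime(2024, 1, 1, 0, 0)
    raw = [(t0, 100.0), (t0, 101.0), (t0 + timedelta(hours=1), 103.0)]
    for row in fill_gaps(dedupe(raw)):
        print(row)
    print(flag_outliers([10, 11, 9, 10, 500]))
```

In practice, flagging outliers before interpolating avoids spreading a bad reading into the filled points, and keeping the `was_filled` marker lets downstream users tell observed values from estimated ones.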