- Design, build, and maintain scalable data pipelines using AWS Glue (PySpark), or equivalent orchestration and transformation tools.
- Engineer and optimise the ClickHouse warehouse for sub-second query performance.
- Implement data contracts between back-office and the platform.
- Build the feature-serving layer providing pre-computed features to AI agents at millisecond latency.
- Integrate with third-party databases, back-office APIs, and external systems (CRM, affiliates, acquisition platforms).
- Establish monitoring, alerting, and maintenance procedures including pipeline health checks and anomaly detection.
- Own CI/CD and infrastructure-as-code for data workloads.
SQLApache KafkaClickhouse+4 more