South Africa, Mauritius, Kenya, Nigeria
Technology, Marketplaces
- BSc degree in Computer Science, Information Systems, Engineering, or related technical field or equivalent work experience.
- 3+ years of related work experience.
- Minimum of 2 years' experience building and optimizing "big data" pipelines and architectures, and maintaining data sets.
- Experienced in Python.
- Experienced in SQL (PostgreSQL, MS SQL).
- Experienced in using cloud services: AWS, Azure or GCP.
- Proficiency in version control, CI/CD and GitHub.
- Understanding/experience in Glue and PySpark highly desirable.
- Experience in managing the data life cycle.
- Proficiency in manipulating, processing and architecting large disconnected data sets for analytical requirements.
- Ability to maintain and optimise processes supporting data transformation, data structures, metadata, dependency and workload management.
- Good understanding of data management principles - data quality assurance and governance.
- Strong analytical skills related to working with unstructured datasets.
- Understanding of message queuing, stream processing, and highly scalable "big data" datastores.
- Strong attention to detail.
- Good communication and interpersonal skills.
- Suggest efficiencies and implement internal process improvements, including automating manual processes.
- Implement enhancements and new features across data systems.
- Improve and streamline processes within data systems with support from the Senior Data Engineer.
- Test CI/CD process for optimal data pipelines.
- Assemble large, complex data sets that meet functional / non-functional business requirements.
- Highly efficient in ETL processes.
- Develop and conduct unit tests on data pipelines and ensure data consistency.
- Develop and maintain automated monitoring solutions.
- Support reporting and analytics infrastructure.
- Maintain data quality and data governance, and handle the overall upkeep of data infrastructure systems.
- Maintain data warehouse and data lake metadata, data catalogue, and user documentation for internal business users.
- Ensure best practices are implemented and maintained on the database.
AWS, PostgreSQL, Python, SQL, ETL, Git, CI/CD
Posted 8 days ago