Great knowledge of Python
Great knowledge of SQL & DBMS
Great knowledge of distributed processing (Apache Spark, Hadoop, Hive, Presto, or similar)
Deep understanding of data modelling (Star schema, Snowflake) and manipulation/cleansing
Good knowledge of non-relational databases (NoSQL) and semi-structured/unstructured data
Experience with AWS environment (S3, Redshift, RDS, SQS, Athena, Glue, CloudWatch, Lambda, or similar)
Experience with code versioning (GitHub or similar)
Experience in batch processing (ETL/ELT)
Advanced/fluent English