- Experience working in large Python, Java, Kotlin, or Go codebases
- Experience running cloud-native Spark systems (e.g. AWS EMR, Databricks, GCP Dataproc)
- Experience in performance tuning of Spark, Ray, Maestro, or Airflow jobs
- Knowledge of data formats such as Parquet, Avro, Arrow, Iceberg, or Delta Lake
- Knowledge of object storage (e.g. S3, GCS)
- Expertise with cloud-scale query performance, query optimization, query planning, heuristic query execution techniques, and cost-driven optimizations
- Experience with the internals of distributed systems, SQL/NoSQL databases, data lakes, or data warehouses
- Strong communication skills and the ability to write detailed technical specifications
- Excitement about coaching and mentoring junior engineers
- BSc, MS, or PhD in Computer Science or a related field
- 8+ years of experience building product software systems