Senior Data Engineer, Data Lakehouse Infrastructure

TRM Labs · Blockchain Analytics
North America (EST/PST) · Full-Time · Senior
Salary: 190,000 - 220,000 USD per year

Job Details

Experience
5+ years
Required Skills
Python, SQL, Apache Airflow, GCP, Kafka, Snowflake, Spark, BigQuery

Requirements

  • 5+ years of experience in data or software engineering, with a focus on distributed data systems and cloud-native architectures
  • Proven experience building and scaling data platforms on GCP, including storage, compute, orchestration, and monitoring
  • Strong command of one or more query engines such as Trino, Presto, Spark, or Snowflake
  • Experience with modern table formats like Apache Hudi, Iceberg, or Delta Lake
  • Exceptional programming skills in Python
  • Strong proficiency in SQL and Spark SQL
  • Hands-on experience orchestrating workflows with Airflow
  • Experience building streaming and batch pipelines using GCP-native services

Responsibilities

  • Design, implement, and scale core components of our lakehouse architecture
  • Own data modeling, ingestion, query performance optimization, and metadata management
  • Architect and scale a high-performance data lakehouse on GCP, leveraging technologies like StarRocks, Apache Iceberg, GCS, BigQuery, Dataproc, and Kafka
  • Design, build, and optimize distributed query engines such as Trino, Spark, or Snowflake to support complex analytical workloads
  • Implement metadata management and data discovery frameworks for governance and observability, using open table formats like Iceberg and Iceberg-compatible catalogs
  • Develop and orchestrate robust ETL/ELT pipelines using Apache Airflow, Spark, and GCP-native tools (e.g., Dataflow, Composer)
  • Collaborate across departments, partnering with data scientists, backend engineers, and product managers to design and implement