Senior Data Engineer

Location: Remote, 8am-5pm Eastern Time
Employment type: Full-Time
Seniority: Senior

Salary not disclosed

Job Details

Experience: 6+ years
Required Skills: AWS, PostgreSQL, Python, Apache Airflow, ETL, Data engineering, CloudFormation, PySpark

Qualifications

Bachelor's degree in Computer Science, Information Systems, or Data Engineering.
6+ years of hands-on data engineering experience.
Strong expertise in AWS Glue (Spark-based), PySpark, and Python (PEP 8).
Experience building large-scale ETL pipelines on AWS using S3, Glue, MWAA, EMR, Lambda, and SNS/SQS.
Experience with Apache Iceberg, Parquet, ORC, Avro, and multi-zone data lake architectures.
Experience with PostgreSQL, Redshift, Oracle, and NoSQL/vector stores.
Experience with Trino, Athena, and Hive for semantic layer development.
Proficiency with CloudFormation, GitHub workflows, and CI/CD pipelines.
Ability to produce complete technical ETL documentation.
Familiarity with FISMA, NIST 800-53, and OWASP ASVS Level 2.
Experience in agile federal environments.

Responsibilities

Design and maintain data retrieval processes for various data sources (APIs, SFTP, etc.).
Build ingestion pipelines using AWS Glue, Airflow (MWAA), EMR, Lambda, and Step Functions.
Parse and process large-volume XML filings using PySpark and Apache Iceberg.
Implement transactional loading and prevent duplicate data loads across S3 and databases.
Integrate the ETL Common Library for standardized orchestration and metadata recording.
Develop and maintain semantic layers with Trino/Athena and materialized views.
Deploy ETL resources using CloudFormation and agency CI/CD pipelines.
Produce a full documentation suite, including data models and mapping documents.
Achieve 90% automated test coverage and adhere to OWASP security standards.
Participate in agile ceremonies, including PI planning and 2-week sprints.
