Senior Data Engineer

Ceresti Health
Health Tech
US-based candidates only
Full-Time
Senior
Salary not disclosed

Job Details

Experience
8+ years
Required Skills
AWS, PostgreSQL, Python, SQL, Airflow, dbt, HIPAA

Requirements

  • BS/BA degree or higher in Computer Science, Engineering, or a related technical field
  • 8+ years of professional data engineering experience
  • Mastery of PostgreSQL
  • Experience with file-based and API-based ingestion
  • Hands-on experience with cloud platforms (AWS preferred)
  • Experience with data warehouses and data lakes
  • Strong experience with dbt or equivalent SQL-based transformation framework
  • Experience with at least one orchestration framework (Dagster, Prefect, or Airflow)
  • Strong Python skills
  • Experience with data validation and quality frameworks
  • Experience with HIPAA-regulated environments
  • Comfortable with infrastructure-as-code and CI/CD
  • Experience supporting ML workloads
  • Experience using AI coding assistants
  • Excellent written and verbal communication skills
  • Experience working in Agile/Scrum teams

Responsibilities

  • Design and own Ceresti’s end-to-end data architecture: a landing zone with secure cloud object storage for raw partner files and API payloads, validated ingestion pipelines into our transactional Postgres, and a curated analytics layer that decouples reporting and AI workloads from production
  • Build ingestion pipelines for the data we receive today, including partner data files (CSV/JSON/XML/HL7/X12 as applicable) and REST API and SFTP integrations, with schema validation, quarantine of bad records, and full lineage from raw bytes to curated row
  • Stand up and operate the curated layer (data warehouse / lakehouse-lite) so analytics and ML models can consume data without slowing down the transactional system
  • Choose, integrate, and operate the smallest set of tools needed, including object storage, an orchestrator (Dagster, Prefect, Airflow, etc.), dbt or similar for transformations, and a single validation library (Great Expectations / Pandera / Soda)
  • Design and enforce data governance for a HIPAA-regulated environment: PHI/PII classification, encryption in transit and at rest, role-based access, audit logging, retention and minimum-necessary policies, and de-identification where appropriate
  • Partner with backend, ML, product, and clinical stakeholders to define data contracts with our health plan and ACO partners and hold the line on data quality
  • Build and maintain reliable feature data for ML models, including embeddings (e.g., pgvector) and curated feature tables for risk stratification, engagement, and outcomes work
  • Instrument the data platform for observability, including pipeline SLAs, data freshness, schema drift, and quality metrics, and act on what the data tells you
  • Participate fully in our Agile process: backlog grooming, sprint planning, demos, and retrospectives
  • Mentor engineers across the team on SQL, schema design, and the craft of building data systems that are boring in the best possible way