Data Integration Engineer

Thyme Care · Healthcare
Remote - US · Full-Time
Salary: 139,500 - 155,000 USD per year

Job Details

Required Skills
Python, SQL, Data modeling, dbt, Databricks, GitHub Actions, Datadog

Requirements

  • Strong SQL skills
  • Familiarity with dbt, including experience working in larger or more complex projects, and an interest in deepening that expertise
  • Working knowledge of Python for data investigation in notebooks
  • Experience operating data pipelines: debugging failures, tracing issues across systems, and communicating clearly about root cause and mitigation
  • Experience with testing and data quality: writing and maintaining tests and using failures/alerts to drive durable fixes
  • Responsiveness and the ability to stay calm and organized when triaging failing ingestion runs or pipelines
  • Willingness to learn new domains and tools quickly (new partner file formats, evolving standards, Databricks), and apply feedback without ego
  • The ability to engage technical and non-technical stakeholders to explain what’s happening in our pipelines and identify opportunities to improve transparency and alerting
  • Healthcare data exposure (claims/eligibility/ADT/etc.)
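The testing and data-quality requirement above can be sketched as a small notebook-style check in pandas; the column names and rules here are invented for illustration, not taken from any actual partner file spec:

```python
import pandas as pd

# Hypothetical eligibility extract; columns are illustrative only.
raw = pd.DataFrame({
    "member_id": ["M1", "M2", "M2", None],
    "effective_date": ["2024-01-01", "2024-02-01", "2024-02-01", "2024-03-01"],
    "term_date": ["2024-12-31", "2024-06-30", "2024-06-30", None],
})

def run_quality_checks(df: pd.DataFrame) -> dict:
    """Return a count of rows failing each check (0 = pass)."""
    df = df.assign(
        effective_date=pd.to_datetime(df["effective_date"]),
        term_date=pd.to_datetime(df["term_date"]),
    )
    return {
        # Every row should identify a member.
        "null_member_id": int(df["member_id"].isna().sum()),
        # Exact duplicate rows usually signal a re-sent or doubled file.
        "duplicate_rows": int(df.duplicated().sum()),
        # Coverage should not end before it starts.
        "term_before_effective": int((df["term_date"] < df["effective_date"]).sum()),
    }

failures = run_quality_checks(raw)
print(failures)
```

In practice checks like these would live as dbt tests or alert conditions rather than ad-hoc notebook code, but the failure-count shape makes it easy to turn an alert into a concrete fix.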

Responsibilities

  • Gain a deep understanding of our data platform and contribute to improving our data models and pipelines using SQL, dbt, and Python (generally data-focused packages, e.g., pandas, polars)
  • Support ingestion of a wide range of healthcare-related sources (claims, eligibility, prior auth, ADT, etc.) by: configuring net-new ingestions (parsing file specs, validating assumptions, communicating inconsistencies); debugging issues in ongoing ones; and helping standardize our processes and pipelines
  • Collaborate with data scientist deal owners and internal stakeholders to turn messy, ambiguous requirements into concrete mapping/validation logic and durable data contracts
  • Use Dagster and GitHub Actions to orchestrate and automate the early stages of our data pipelines, improving run reliability and reducing manual intervention
  • Work hands-on with raw data using Jupyter Notebooks in Databricks to investigate data issues, validate assumptions, and unblock processing
  • Design and support incremental data loads (append/merge/upsert patterns) and safe reprocessing (idempotent runs, late-arriving data, backfills)
  • Learn to use Datadog and PagerDuty to monitor pipelines, triage incidents during business hours, communicate impact clearly, and drive root-cause fixes to prevent recurrences
  • Contribute to a complex, self-hosted dbt monorepo: implement transformations, incremental models, tests, documentation, and conventions that scale across deals
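The append/merge/upsert and safe-reprocessing responsibilities above can be sketched in pandas; the table, key column, and merge policy below are assumptions for illustration, not Thyme Care's actual implementation:

```python
import pandas as pd

# Hypothetical target table and an incoming batch; columns are illustrative.
target = pd.DataFrame({
    "claim_id": ["C1", "C2"],
    "status": ["paid", "pending"],
    "updated_at": pd.to_datetime(["2024-01-01", "2024-01-02"]),
})
batch = pd.DataFrame({
    "claim_id": ["C2", "C3"],
    "status": ["paid", "pending"],
    "updated_at": pd.to_datetime(["2024-01-05", "2024-01-05"]),
})

def upsert(target: pd.DataFrame, batch: pd.DataFrame, key: str = "claim_id") -> pd.DataFrame:
    """Merge a batch into the target: new keys are appended, existing keys
    are overwritten by the batch's row. Replaying the same batch produces
    the same result, which is what makes reprocessing/backfills safe."""
    combined = pd.concat([target, batch], ignore_index=True)
    # keep="last" prefers the batch's version of any duplicated key
    return (
        combined.drop_duplicates(subset=key, keep="last")
        .sort_values(key)
        .reset_index(drop=True)
    )

once = upsert(target, batch)
twice = upsert(once, batch)  # replaying the batch is a no-op (idempotent)
assert once.equals(twice)
print(once)
```

In a dbt project the same policy would typically be an incremental model with a `unique_key`, with late-arriving data handled by re-running over a lookback window.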