Data Integration Engineer

Thyme Care · Healthcare
Remote - US · Full-Time
Salary: 139,500 - 155,000 USD per year

Job Details

Required Skills
Python, SQL, Data modeling, dbt, Databricks, GitHub Actions, Datadog

Requirements

  • Strong SQL skills
  • Familiarity with dbt, including experience working in larger or more complex projects, and an interest in deepening that expertise
  • Working knowledge of Python for data investigation in notebooks
  • Experience operating data pipelines: debugging failures, tracing issues across systems, and communicating clearly about root cause and mitigation
  • Experience with testing and data quality: writing and maintaining tests and using failures/alerts to drive durable fixes
  • Responsiveness and the ability to stay calm and organized when triaging failing ingestion runs or pipelines
  • Willingness to learn new domains and tools quickly (new partner file formats, evolving standards, Databricks), and apply feedback without ego
  • The ability to engage technical and non-technical stakeholders to explain what’s happening in our pipelines and identify opportunities to improve transparency and alerting
  • Healthcare data exposure (claims/eligibility/ADT/etc.)
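The testing and data-quality requirement above can be sketched as a small notebook-style check in pandas; the column names and rules here are invented for illustration, not taken from any actual partner file spec:

```python
import pandas as pd

# Hypothetical eligibility extract; columns are illustrative only.
raw = pd.DataFrame({
    "member_id": ["M1", "M2", "M2", None],
    "effective_date": ["2024-01-01", "2024-02-01", "2024-02-01", "2024-03-01"],
    "term_date": ["2024-12-31", "2024-06-30", "2024-06-30", None],
})

def run_quality_checks(df: pd.DataFrame) -> dict:
    """Return a count of rows failing each check (0 = pass)."""
    df = df.assign(
        effective_date=pd.to_datetime(df["effective_date"]),
        term_date=pd.to_datetime(df["term_date"]),
    )
    return {
        # Every row should identify a member.
        "null_member_id": int(df["member_id"].isna().sum()),
        # Exact duplicate rows usually signal a re-sent or doubled file.
        "duplicate_rows": int(df.duplicated().sum()),
        # Coverage should not end before it starts.
        "term_before_effective": int((df["term_date"] < df["effective_date"]).sum()),
    }

failures = run_quality_checks(raw)
print(failures)
```

In practice checks like these would live as dbt tests or alert conditions rather than ad-hoc notebook code, but the failure-count shape makes it easy to turn an alert into a concrete fix.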

Responsibilities

  • Gain a deep understanding of our data platform and contribute to improving our data models and pipelines using SQL, dbt, and Python (generally data-focused packages, e.g., pandas, polars)
  • Support ingestion of a wide range of healthcare-related sources (claims, eligibility, prior auth, ADT, etc.) by: configuring net-new ingestions (parsing file specs, validating assumptions, communicating inconsistencies); debugging issues in ongoing ones; and helping standardize our processes and pipelines
  • Collaborate with data scientist deal owners and internal stakeholders to turn messy, ambiguous requirements into concrete mapping/validation logic and durable data contracts
  • Use Dagster and GitHub Actions to orchestrate and automate the early stages of our data pipelines, improving run reliability and reducing manual intervention
  • Work hands-on with raw data using Jupyter Notebooks in Databricks to investigate data issues, validate assumptions, and unblock processing
  • Design and support incremental data loads (append/merge/upsert patterns) and safe reprocessing (idempotent runs, late-arriving data, backfills)
  • Learn to use Datadog and PagerDuty to monitor pipelines, triage incidents during business hours, communicate impact clearly, and drive root-cause fixes to prevent recurrences
  • Contribute to a complex, self-hosted dbt monorepo: implement transformations, incremental models, tests, documentation, and conventions that scale across deals
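The append/merge/upsert and safe-reprocessing responsibilities above can be sketched in pandas; the table, key column, and merge policy below are assumptions for illustration, not Thyme Care's actual implementation:

```python
import pandas as pd

# Hypothetical target table and an incoming batch; columns are illustrative.
target = pd.DataFrame({
    "claim_id": ["C1", "C2"],
    "status": ["paid", "pending"],
    "updated_at": pd.to_datetime(["2024-01-01", "2024-01-02"]),
})
batch = pd.DataFrame({
    "claim_id": ["C2", "C3"],
    "status": ["paid", "pending"],
    "updated_at": pd.to_datetime(["2024-01-05", "2024-01-05"]),
})

def upsert(target: pd.DataFrame, batch: pd.DataFrame, key: str = "claim_id") -> pd.DataFrame:
    """Merge a batch into the target: new keys are appended, existing keys
    are overwritten by the batch's row. Replaying the same batch produces
    the same result, which is what makes reprocessing/backfills safe."""
    combined = pd.concat([target, batch], ignore_index=True)
    # keep="last" prefers the batch's version of any duplicated key
    return (
        combined.drop_duplicates(subset=key, keep="last")
        .sort_values(key)
        .reset_index(drop=True)
    )

once = upsert(target, batch)
twice = upsert(once, batch)  # replaying the batch is a no-op (idempotent)
assert once.equals(twice)
print(once)
```

In a dbt project the same policy would typically be an incremental model with a `unique_key`, with late-arriving data handled by re-running over a lookback window.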