Staff Data Engineer

Arine · Healthcare technology
Remote (United States of America) · Full-Time · Staff
Salary: 170,000 - 185,000 USD per year

Job Details

Experience
10+ years working in data engineering, with a focus on large-scale data ingestion and infrastructure
Required Skills
AWS, Docker, Python, Kubernetes, EHR, HIPAA

Requirements

  • 10+ years working in data engineering, with a focus on large-scale data ingestion and infrastructure
  • A track record of building automated, production-grade ETL processes using Python and dbt SQL
  • Strong understanding of ETL/ELT frameworks and distributed data processing
  • Demonstrated hands-on experience building software with AI coding tools - not just autocomplete, but directing AI agents to generate complete solutions and applying disciplined review and ownership of the output
  • A genuine conviction that AI-augmented development is the future of software engineering, paired with the judgment to validate, test, and take accountability for AI-generated code
  • Experience or strong interest in integrating LLMs into engineering workflows beyond development assistance - such as automating data quality checks, generating pipeline logic, or surfacing anomalies
  • Proven ability to handle and process varied file types and formats, including healthcare standards such as HL7, 834, 837, and NCPDP
  • Demonstrated success integrating and consolidating data from diverse source systems into a unified repository, including EHR and claims systems, via both file-based and API integrations
  • Comfort working with large-scale datasets (10GB+), with strong capability implementing incremental processing and change data capture (CDC) methodologies
  • Extensive background designing scalable data architectures in AWS environments
  • Solid grounding in software engineering principles, including test-driven development, loose coupling, single responsibility, and modular design
  • Hands-on familiarity with containerization (Docker, Kubernetes) and proven ability to build configuration-driven systems that diverse engineering profiles can operate without code changes
  • A passion for building new data infrastructure and continuously improving existing systems with robustness, maintainability, and operational excellence

Responsibilities

  • Act as the team architect by leading system design reviews, offering recommendations, conducting comprehensive peer reviews, and demonstrating expert-level proficiency in Python and AWS services
  • Architect and implement scalable data ingestion pipelines, including incremental ingestion strategies for large-scale healthcare datasets
  • Develop reusable, configuration-driven, containerized pipeline components and toolsets that diverse engineering profiles can use and maintain
  • Work collaboratively with cross-functional teams to ensure their data requirements are met through ETL components
  • Design and maintain data transformation pipelines using dbt, utilizing core concepts such as macros, incremental models, and dbt tests
  • Build monitoring and alerting systems for data ingestion processes and pipeline health
  • Apply software engineering best practices such as test-driven development and modular design to data infrastructure, including refactoring existing ingestion processes to improve scalability and operational efficiency
  • Provide technical guidance and mentorship to junior engineers, and promote best practices and coding standards
  • Champion AI-assisted development across the team - establishing norms, workflows, and expectations for using AI coding tools (e.g., Claude Code, Cursor, Copilot) to generate, iterate, and ship production-quality code
  • Model the “builder to reviewer” shift - demonstrating how senior engineers direct AI agents to produce full solutions, then apply rigorous review, testing, and judgment to own the output