Staff Data Engineer
Arine
Healthcare technology
Remote (United States of America) · Full-Time · Staff
Salary: 170,000 - 185,000 USD per year
Job Details
- Experience
- 10+ years working in data engineering, with a focus on large-scale data ingestion and infrastructure
- Required Skills
- AWS, Docker, Python, Kubernetes, EHR, HIPAA
Requirements
- 10+ years working in data engineering, with a focus on large-scale data ingestion and infrastructure
- A track record of building automated, production-grade ETL processes using Python and dbt (SQL)
- Strong understanding of ETL/ELT frameworks and distributed data processing
- Demonstrated hands-on experience building software with AI coding tools - not just autocomplete, but directing AI agents to generate complete solutions and applying disciplined review and ownership of the output
- A genuine conviction that AI-augmented development is the future of software engineering, paired with the judgment to validate, test, and take accountability for AI-generated code
- Experience or strong interest in integrating LLMs into engineering workflows beyond development assistance - such as automating data quality checks, generating pipeline logic, or surfacing anomalies
- Proven ability to handle and process varied file types and formats, including healthcare standards such as HL7, 834, 837, and NCPDP
- Demonstrated success integrating and consolidating data from diverse source systems into a unified repository, including EHR and claims systems, via both file-based and API integrations
- Comfort working with large-scale datasets (10GB+), with strong capability implementing incremental processing and change data capture (CDC) methodologies
- Extensive background designing scalable data architectures in AWS environments
- Solid grounding in software engineering principles, including test-driven development, loose coupling, single responsibility, and modular design
- Hands-on familiarity with containerization (Docker, Kubernetes) and proven ability to build configuration-driven systems that diverse engineering profiles can operate without code changes
- A passion for building new data infrastructure and continuously improving existing systems for robustness, maintainability, and operational excellence
Responsibilities
- Act as the team architect by leading system design reviews, offering recommendations, conducting comprehensive peer reviews, and demonstrating expert-level proficiency in Python and AWS services
- Architect and implement scalable data ingestion pipelines, including incremental ingestion strategies for large-scale healthcare datasets
- Develop reusable, configuration-driven, containerized pipeline components and toolsets that diverse engineering profiles can use and maintain
- Work collaboratively with cross-functional teams to ensure their data requirements are met through ETL components
- Design and maintain data transformation pipelines using dbt, leveraging core concepts such as macros, incremental models, and dbt tests
- Build monitoring and alerting systems for data ingestion processes and pipeline health
- Apply software engineering best practices, including test-driven development and modular design, to data infrastructure, including refactoring existing ingestion processes to improve scalability and operational efficiency
- Provide technical guidance and mentorship to junior engineers, and promote best practices and coding standards
- Champion AI-assisted development across the team - establishing norms, workflows, and expectations for using AI coding tools (e.g., Claude Code, Cursor, Copilot) to generate, iterate, and ship production-quality code
- Model the “builder to reviewer” shift - demonstrating how senior engineers direct AI agents to produce full solutions, then apply rigorous review, testing, and judgment to own the output