Staff Data Engineer
Arine
Healthcare technology
Remote (United States of America) · Full-Time · Staff
Salary: 170,000 - 185,000 USD per year
Job Details
- Experience
- 10+ years working in data engineering, with a focus on large-scale data ingestion and infrastructure
- Required Skills
- AWS, Docker, Python, Kubernetes, EHR, HIPAA
Requirements
- 10+ years working in data engineering, with a focus on large-scale data ingestion and infrastructure
- A track record of building automated, production-grade ETL processes using Python and dbt (SQL)
- Strong understanding of ETL/ELT frameworks and distributed data processing
- Demonstrated hands-on experience building software with AI coding tools - not just autocomplete, but directing AI agents to generate complete solutions and applying disciplined review and ownership of the output
- A genuine conviction that AI-augmented development is the future of software engineering, paired with the judgment to validate, test, and take accountability for AI-generated code
- Experience or strong interest in integrating LLMs into engineering workflows beyond development assistance - such as automating data quality checks, generating pipeline logic, or surfacing anomalies
- Proven ability to handle and process varied file types and formats, including healthcare standards such as HL7, 834, 837, and NCPDP
- Demonstrated success integrating and consolidating data from diverse source systems into a unified repository, including EHR and claims systems, via both file-based and API integrations
- Comfort working with large-scale datasets (10GB+), with strong capability implementing incremental processing and change data capture (CDC) methodologies
- Extensive background designing scalable data architectures in AWS environments
- Solid grounding in software engineering principles, including test-driven development, loose coupling, single responsibility, and modular design
- Hands-on familiarity with containerization (Docker, Kubernetes) and proven ability to build configuration-driven systems that diverse engineering profiles can operate without code changes
- A passion for building new data infrastructure and continuously improving existing systems for robustness, maintainability, and operational excellence
Responsibilities
- Act as the team architect by leading system design reviews, offering recommendations, conducting comprehensive peer reviews, and demonstrating expert-level proficiency in Python and AWS services
- Architect and implement scalable data ingestion pipelines, including incremental ingestion strategies for large-scale healthcare datasets
- Develop reusable, configuration-driven, containerized pipeline components and toolsets that diverse engineering profiles can use and maintain
- Work collaboratively with cross-functional teams to ensure their data requirements are met through ETL components
- Design and maintain data transformation pipelines using dbt, leveraging core concepts such as macros, incremental models, and dbt tests
- Build monitoring and alerting systems for data ingestion processes and pipeline health
- Apply software engineering best practices, including test-driven development and modular design, to data infrastructure, including refactoring existing ingestion processes to improve scalability and operational efficiency
- Provide technical guidance and mentorship to junior engineers, and promote best practices and coding standards
- Champion AI-assisted development across the team - establishing norms, workflows, and expectations for using AI coding tools (e.g., Claude Code, Cursor, Copilot) to generate, iterate, and ship production-quality code
- Model the “builder to reviewer” shift - demonstrating how senior engineers direct AI agents to produce full solutions, then apply rigorous review, testing, and judgment to own the output