Staff Data Engineer

AbleHealthcare
Remote (LATAM) · Full-Time · Staff
Salary not disclosed

Job Details

Languages
English
Experience
8+ years
Required Skills
AWS, Python, Kafka, Spark, Terraform, Redshift, Databricks, AWS Lambda, CloudFormation, PySpark

Requirements

  • 8+ years of data engineering experience
  • Deep hands-on expertise in Databricks (Delta Lake, Unity Catalog, DLT)
  • Strong proficiency with AWS data services (S3, Glue, Lambda, Kinesis, Redshift, Athena, IAM)
  • Advanced Python and PySpark/Spark development skills
  • Experience with streaming and event-driven architectures using Kafka (Amazon MSK or Confluent)
  • Proven ability to implement data governance frameworks
  • Strong understanding of data modeling for analytical and operational use cases
  • Experience with infrastructure-as-code (Terraform, CloudFormation, or CDK) and CI/CD pipelines
  • Familiarity with regulatory and compliance requirements (HIPAA, SOC 2, ISO 27001)
  • Excellent collaboration and communication skills
  • Bachelor's degree in Computer Science, Data Science, Engineering, or a related field

Responsibilities

  • Design, build, and operate a Databricks medallion lakehouse architecture (Bronze/Silver/Gold layers) using Delta Live Tables to support ingestion, transformation, and serving of clinical, behavioral, and operational data across a multi-country digital health platform
  • Architect and maintain scalable data pipelines on AWS (S3, Glue, Lambda, Kinesis, MSK/Kafka) that ingest data from diverse sources
  • Implement multi-country data isolation and governance leveraging Databricks Unity Catalog, enforcing data residency requirements and integrating policy-as-code consent enforcement
  • Partner with platform, compliance, and analytics teams to define and enforce data quality standards, lineage tracking, schema evolution strategies, and tamper-evident audit logging
  • Support clinical data interoperability by implementing and maintaining FHIR-to-OMOP mapping pipelines
  • Optimize data platform performance, cost, and reliability through partitioning strategies, compaction, caching, cluster sizing, and monitoring
  • Contribute to certification and compliance readiness (e.g., ISO 27001, SOC 2 Type 2) by maintaining documentation, change control processes, and validation artifacts
  • Collaborate on real-time and event-driven architectures integrating Kafka-based streaming with the medallion layers and workflow orchestration
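One of the responsibilities above is implementing FHIR-to-OMOP mapping pipelines. As a minimal illustration of what that mapping involves (plain Python; the hard-coded gender concept IDs and the single-resource scope are simplifying assumptions — real pipelines resolve concepts through the OMOP standardized vocabularies and run at scale, e.g. in PySpark):

```python
# Toy mapping from a FHIR R4 Patient resource to an OMOP CDM PERSON row.
# Illustrative only: production pipelines look up concept IDs in the OMOP
# standardized vocabularies rather than using a hard-coded dict.
from datetime import date

# Simplified gender concept lookup (8507/8532 are the standard OMOP
# concept IDs for male/female; 0 means "no matching concept").
GENDER_CONCEPTS = {"male": 8507, "female": 8532}

def fhir_patient_to_omop_person(patient: dict) -> dict:
    """Map a FHIR Patient resource (parsed JSON) to an OMOP PERSON row."""
    birth = date.fromisoformat(patient["birthDate"])
    return {
        "person_source_value": patient["id"],
        "gender_concept_id": GENDER_CONCEPTS.get(patient.get("gender"), 0),
        "year_of_birth": birth.year,
        "month_of_birth": birth.month,
        "day_of_birth": birth.day,
    }

example = {"resourceType": "Patient", "id": "pat-123",
           "gender": "female", "birthDate": "1987-04-12"}
row = fhir_patient_to_omop_person(example)
```

In practice the same per-resource transform would be applied as a PySpark UDF or Delta Live Tables transformation in the Silver layer, with unmapped codes routed to a quarantine table for data-quality review.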