AI Data Engineer

C the Signs
Healthcare AI
Boston, Massachusetts, United States. New York, New York, United States. New York, United States. New Jersey, United States. New Hampshire, United States. Rhode Island, United States
Full-Time
Middle
Salary not disclosed

Job Details

Required Skills
AWS, Python, ETL, GCP, Java, Azure, Spark, Scala, Data modeling

Requirements

  • Bachelor's degree in Computer Science, Engineering, or a related field.
  • Proven experience as a Data Engineer, with a focus on big data technologies.
  • Strong proficiency in programming languages such as Python, Scala, or Java.
  • Extensive experience with data warehousing, ETL processes, and data modeling.
  • Experience with major cloud providers (e.g., AWS, GCP, Azure) and their data storage and processing services.
  • Hands-on experience with big data frameworks like Apache Spark for distributed processing.
  • Excellent problem-solving skills and the ability to work independently and as part of a team.
  • Strong communication and interpersonal skills.
  • Master's degree in a related field preferred.
  • Experience with healthcare data and healthcare data standards (e.g., FHIR, HL7) preferred.
  • Familiarity with machine learning concepts and LLM fine-tuning processes preferred.
  • Experience with data orchestration tools (e.g., Apache Airflow) preferred.

Responsibilities

  • Collaborate with data scientists and machine learning engineers to understand data requirements for LLM and machine learning model fine-tuning.
  • Design, build, and maintain scalable data pipelines to ingest, process, and store massive and diverse healthcare datasets.
  • Implement data validation and monitoring to ensure the integrity, accuracy, and consistency of all training datasets.
  • Implement data cleaning and transformation processes to maintain data quality.
  • Develop and optimize data structures and schemas for efficient access and utilization by LLMs and machine learning models.
  • Work with the team to identify and acquire new data sources, ensuring compliance with relevant healthcare regulations (e.g., HIPAA).
  • Monitor data pipeline performance, troubleshoot issues, and implement optimizations to improve efficiency and reliability.
  • Document data engineering processes, data models, and data dictionaries.