Senior Research Data Engineer
Remote (Canada) | Full-Time | Senior
Salary: $159,100 - $176,700 a year
Job Details
- Experience: 5+ years
- Required Skills: Python, SQL, Machine Learning, Databricks, PySpark
Requirements
- 5+ years building production data systems, with at least 2 of those years supporting ML or AI workloads.
- Advanced Python, SQL, and PySpark/Databricks experience.
- Expertise in Databricks ecosystem: Delta Lake, Unity Catalog, Spark tuning, and MLflow.
- AI domain literacy: embeddings, tokenization, feature engineering, and point-in-time correctness.
- Experience wrangling unstructured (text, PDFs, logs) and structured data.
- Proficiency with data quality, filtering, near-duplicate detection, and synthetic data generation.
- Familiarity with pipeline orchestration tools like Airflow, Dagster, or Prefect.
- Experience handling regulated or sensitive data (HIPAA or equivalent).
- Strong written documentation skills and experience eliciting requirements from experts.
- Bachelor’s degree in computer science, data science, engineering, or a related field.
Responsibilities
- Reverse-engineer data semantics by collaborating with product, clinical, and workflow experts.
- Bridge technical data semantics with researcher requirements to design and build the gold data layer.
- Curate datasets across modalities including unstructured content, rich metadata, and point-in-time features.
- Develop reusable, observable data transformation pipelines using Databricks/Spark.
- Automate data quality, filtering, and synthetic data generation pipelines.
- Maintain versioned dataset snapshots and define clean lineage for downstream AI research.