Senior/Lead Data Engineer
Truelogic
Software, AI
Colombia, Full-Time, Lead
Salary not disclosed
Job Details
- Required Skills: Python, SQL, Git, Spark, dbt, Databricks, PySpark
Requirements
- Expertise in SQL and data modeling, including dimensional modeling, medallion architecture, slowly changing dimensions (SCDs), and grain management.
- Proven ability to design idempotent pipelines using incremental, checkpoint, and replaceWhere strategies.
- Extensive experience with production-grade Python engineering, including type hints, pytest, and ruff.
- Strong ability to diagnose and resolve failing Spark/PySpark jobs using tools such as the Spark UI.
- Deep understanding of Delta Lake features such as MERGE, OPTIMIZE, Z-ORDER, and time travel.
- Hands-on expertise with dbt, including models, tests, and exposures.
- Experience authoring and deploying jobs using Databricks Asset Bundles (DAB) and operating within a Unity Catalog environment.
- Strong adherence to disciplined Git workflows, conventional commits, and strict documentation practices.
- Experience provisioning and utilizing Service Principals, GitHub environment secrets, and secret management tools like Azure Key Vault or Databricks secret scopes.
- Strong written technical communication skills for PR descriptions and runbooks.
- Experience leading technical initiatives, establishing architectural standards, and contributing to interview rubrics is preferred.
- Experience reading or modifying Azure Data Factory (ADF) pipelines and familiarity with Azure Data Lake storage is highly preferred.
Responsibilities
- Design and build robust, idempotent data pipelines from scratch utilizing a modern data stack.
- Design star and snowflake schemas and write precise, grain-aware SQL to build scalable data marts.
- Write production-grade, unit-tested Python code at the module level, adhering to strong engineering disciplines such as type hinting and testing.
- Build and test dbt models across staging, intermediate, and mart layers while managing overall project structure.
- Author and deploy jobs using Databricks Asset Bundles (DAB) following documented architectural patterns.
- Implement rigorous data quality checks at source, intermediate, and destination layers so that nulls, duplicates, and dropped rows do not go undetected.
- Maintain data governance through comprehensive dbt tests and strict documentation-at-merge-time discipline.
- Operate securely within a multi-repository architecture, utilizing service principals and ensuring zero personal credentials in production deployments.
- Own data pipelines end-to-end, making key technical design decisions and mentoring mid-level engineers through substantive code reviews.
- Define overarching technical direction across core data systems and act as a technical leader to unblock the team.