Staff Data Engineer - Semantic Data Layer

Zeta Global - MarTech/AdTech

Remote - United States, Full-Time, Staff

Salary: 170,000 - 200,000 USD per year

Job Details

Experience: 10+ years of experience in data engineering, data architecture, or platform engineering, with at least 3 years operating at a Staff/Principal level.
Required Skills: GraphQL, PostgreSQL, SQL, DynamoDB, MongoDB, MySQL, Snowflake, Apache Kafka, RESTful APIs, BigQuery, Redshift

Requirements

10+ years of experience in data engineering, data architecture, or platform engineering
3+ years operating at a Staff/Principal level
Deep hands-on expertise with relational databases (MySQL/PostgreSQL)
Deep hands-on expertise with NoSQL databases (DynamoDB, Aerospike, MongoDB)
Deep hands-on expertise with cloud data warehouses (Snowflake, BigQuery, Redshift)
Deep hands-on expertise with data lakes (S3, Delta Lake, Iceberg)
Strong experience with streaming and messaging systems (Apache Kafka, Amazon SQS/SNS, Kinesis)
Proven experience building or operating semantic/metrics layers (Cube.js/Cube Core, dbt Metrics, LookML)
Expert-level SQL skills and experience with query optimization across distributed systems
Production experience designing multi-tenant data platforms with strict security and isolation requirements
Strong understanding of data governance and access control models (RBAC, ABAC)
Strong understanding of compliance frameworks (SOC 2, GDPR, CCPA)
Experience designing and exposing APIs (REST, GraphQL) for data consumption at scale
BS/MS in Computer Science, Data Engineering, or equivalent practical experience
Experience building data interfaces specifically for AI/ML consumption (tool-use APIs for LLM agents, MCP, function-calling patterns)
Familiarity with AI orchestration frameworks (LangChain, LlamaIndex, Semantic Kernel)
Experience with infrastructure-as-code (Terraform, Pulumi), container orchestration (Kubernetes, ECS), and CI/CD pipelines
Background in MarTech/AdTech data domains (identity graphs, audience segmentation, campaign analytics, attribution modeling, real-time bidding data)

Responsibilities

Design and build a centralized semantic data layer using Cube Core (or equivalent technology) that provides a unified, governed abstraction over all company data sources.
Define semantic models, metrics, dimensions, and relationships that map to business domains.
Expose the semantic layer via REST/GraphQL APIs and MCP-compatible tool interfaces for consumption by AI agents and LLMs.
Integrate and unify data from heterogeneous systems, including MySQL, DynamoDB, Aerospike, Snowflake, Amazon S3, Apache Kafka, and Amazon SQS.
Build connectors, adapters, and federation layers to query across operational (OLTP) and analytical (OLAP) data sources.
Design tool interfaces and API contracts that allow AI agents to discover data, understand schema semantics, and generate accurate queries autonomously.
Collaborate with AI/ML teams to optimize the semantic layer for LLM-generated SQL, natural language querying, RAG, and agentic workflows.
Architect the semantic layer with native multi-tenant isolation, ensuring strict data segregation and tenant-scoped access controls.
Implement row-level security, column-level masking, and attribute-based access controls (ABAC) to enforce data governance policies.
Design for horizontal scalability to support thousands of concurrent queries from AI agents, internal dashboards, and customer-facing products.
Serve as the technical authority on data architecture decisions, authoring ADRs and reference architectures.
Mentor and guide senior engineers on best practices for semantic modeling, data governance, and API design.