Staff Data Engineer - Semantic Data Layer
Zeta Global - MarTech/AdTech
Remote - United States - Full-Time - Staff
Salary: 170,000 - 200,000 USD per year
Job Details
- Experience: 10+ years of experience in data engineering, data architecture, or platform engineering, with at least 3 years operating at a Staff/Principal level.
- Required Skills: GraphQL, PostgreSQL, SQL, DynamoDB, MongoDB, MySQL, Snowflake, Apache Kafka, RESTful APIs, BigQuery, Redshift
Requirements
- 10+ years of experience in data engineering, data architecture, or platform engineering
- 3+ years operating at a Staff/Principal level
- Deep hands-on expertise with relational databases (MySQL/PostgreSQL)
- Deep hands-on expertise with NoSQL databases (DynamoDB, Aerospike, MongoDB)
- Deep hands-on expertise with cloud data warehouses (Snowflake, BigQuery, Redshift)
- Deep hands-on expertise with data lakes (S3, Delta Lake, Iceberg)
- Strong experience with streaming and messaging systems (Apache Kafka, Amazon SQS/SNS, Kinesis)
- Proven experience building or operating semantic/metrics layers (Cube.js/Cube Core, dbt Metrics, LookML)
- Expert-level SQL skills and experience with query optimization across distributed systems
- Production experience designing multi-tenant data platforms with strict security and isolation requirements
- Strong understanding of data governance and access control models (RBAC, ABAC)
- Strong understanding of compliance frameworks (SOC 2, GDPR, CCPA)
- Experience designing and exposing APIs (REST, GraphQL) for data consumption at scale
- BS/MS in Computer Science, Data Engineering, or equivalent practical experience
- Experience building data interfaces specifically for AI/ML consumption (tool-use APIs for LLM agents, MCP, function-calling patterns)
- Familiarity with AI orchestration frameworks (LangChain, LlamaIndex, Semantic Kernel)
- Experience with infrastructure-as-code (Terraform, Pulumi), container orchestration (Kubernetes, ECS), and CI/CD pipelines
- Background in MarTech/AdTech data domains (identity graphs, audience segmentation, campaign analytics, attribution modeling, real-time bidding data)
Responsibilities
- Design and build a centralized semantic data layer using Cube Core (or equivalent technology) that provides a unified, governed abstraction over all company data sources.
- Define semantic models, metrics, dimensions, and relationships that map to business domains.
- Expose the semantic layer via REST/GraphQL APIs and MCP-compatible tool interfaces for consumption by AI agents and LLMs.
- Integrate and unify data from heterogeneous systems, including MySQL, DynamoDB, Aerospike, Snowflake, Amazon S3, Apache Kafka, and Amazon SQS.
- Build connectors, adapters, and federation layers to query across operational (OLTP) and analytical (OLAP) data sources.
- Design tool interfaces and API contracts that allow AI agents to discover data, understand schema semantics, and generate accurate queries autonomously.
- Collaborate with AI/ML teams to optimize the semantic layer for LLM-generated SQL, natural language querying, RAG, and agentic workflows.
- Architect the semantic layer with native multi-tenant isolation, ensuring strict data segregation and tenant-scoped access controls.
- Implement row-level security, column-level masking, and attribute-based access controls (ABAC) to enforce data governance policies.
- Design for horizontal scalability to support thousands of concurrent queries from AI agents, internal dashboards, and customer-facing products.
- Serve as the technical authority on data architecture decisions, authoring ADRs and reference architectures.
- Mentor and guide senior engineers on best practices for semantic modeling, data governance, and API design.