Principal DevOps Engineer
Zeta Global
MarTech, AdTech
Remote - United States, Full-Time, Principal
Salary: 180,000 - 210,000 USD per year
Job Details
- Experience: 10+ years
- Required Skills: AWS, Docker, Node.js, Python, DynamoDB, Java, Kubernetes, Ruby, Apache Kafka, Grafana, Prometheus, React, CI/CD, Terraform, Helm
Requirements
- 10+ years of progressive experience in DevOps, SRE, Platform Engineering, or Infrastructure Engineering roles, with demonstrated impact at staff or principal level.
- Expert-level Kubernetes knowledge, including cluster administration, Helm chart authoring, custom controllers/operators, network policies, RBAC, and multi-cluster management on AWS EKS.
- Deep expertise in CI/CD pipeline architecture and advanced deployment strategies (canary, blue/green, progressive delivery, feature flag integration) at scale.
- Strong proficiency with Infrastructure as Code using Terraform, including module design, state management, and multi-environment orchestration.
- Expert knowledge of Docker containerization, including multi-stage builds, security hardening, image optimization, and container runtime management.
- Production experience with Apache Kafka, including cluster management, topic design, consumer group strategies, and operational monitoring for high-throughput streaming workloads.
- Strong networking fundamentals: DNS (Route 53, internal DNS), TCP/IP, routing, API Gateway, load balancing (ALB/NLB), service mesh, VPC peering, transit gateways, and network troubleshooting.
- Extensive AWS experience spanning EKS, EC2, SQS, DynamoDB, IAM, VPC, CloudWatch, and related services in production environments.
- Hands-on experience with observability platforms: Grafana (dashboards, alerting), Prometheus (metrics, PromQL), Loki (log aggregation), and Honeycomb (distributed tracing, BubbleUp analysis).
- Working familiarity with multiple language stacks including Node.js, React, Python, Java, and Ruby, sufficient to understand build systems, dependency management, and runtime characteristics.
- Experience operating within regulated environments, with practical knowledge of GDPR, CCPA, SOC 2, and compliance automation in MarTech or AdTech domains.
- Proven ability to influence engineering culture, drive adoption of new practices, and communicate complex technical strategies clearly to both technical and non-technical stakeholders.
- Demonstrated experience with GitLab CI/CD pipelines, including advanced pipeline features such as parent-child pipelines, dynamic environments, and security scanning integration.
Responsibilities
- Design, build, and operate production-grade CI/CD pipelines enabling multiple developers on multiple teams to deploy concurrently to production, multiple times daily, with zero-downtime guarantees.
- Implement and optimize advanced deployment strategies including canary releases, blue/green deployments, rolling updates, incremental rollouts, and feature flag-gated releases via Statsig.
- Build self-service deployment tooling that empowers developers to own their release process while enforcing safety guardrails, automated rollback triggers, and automated compliance gates.
- Establish deployment observability with real-time canary analysis, automated health scoring, and progressive delivery metrics integrated with Grafana, Prometheus, and Honeycomb.
- Define and enforce SLOs/SLIs/SLAs across services, establishing error budgets that balance velocity with reliability.
- Lead incident response processes, including on-call rotations, runbook development, blameless postmortems, and incident command structure.
- Manage and optimize AWS infrastructure spanning EC2, SQS, DynamoDB, and related services with Infrastructure as Code (Terraform) best practices.
- Design and operate Kafka-based event streaming infrastructure for high-throughput, low-latency data pipelines.
- Embed compliance controls directly into CI/CD pipelines, ensuring automated enforcement of GDPR, CCPA, and SOC 2 requirements.
- Serve as a technical leader and DevOps disruptor, challenging legacy processes and introducing modern practices that dramatically improve developer velocity and operational safety.