Senior DevOps Data Engineer

Inactive

EverOps | DevOps

U.S.-Based | Full-Time | Senior

Salary not disclosed

Job Details

Experience: 5+ years
Required Skills: AWS, PostgreSQL, Python, Bash, DynamoDB, GCP, Jenkins, Kafka, Kubernetes, MongoDB, MySQL, Azure, Cassandra, Go, Grafana, Prometheus, Rust, Linux, Terraform, Redshift, GitHub Actions, Datadog, CloudFormation

Requirements

5+ years of professional experience as a DevOps Engineer, Data Platform Engineer, Database Reliability Engineer, or Site Reliability Engineer with a data infrastructure focus
Deep hands-on experience designing and operating disaster recovery architectures for production databases (failover, replication, backup/restore, cross-region DR)
Production experience planning and executing database cutover workflows—blue-green database swaps, read-replica promotions, DMS-based migrations, and zero-downtime schema changes
Strong experience with AWS managed data services: RDS/Aurora (Multi-AZ, Global Database, cross-region replicas), DynamoDB (Global Tables, PITR, on-demand backup), ElastiCache, Redshift, and/or MSK
Hands-on experience with Infrastructure as Code (Terraform + Atlantis and/or CloudFormation) for data platform provisioning and lifecycle management
Hands-on experience with, and a deep understanding of, Linux
Strong professional experience with at least one of: Python, Golang, Bash, or Rust for automation and tooling
Production experience with Amazon EKS, including an understanding of how data workloads intersect with Kubernetes (StatefulSets, PVCs, External Secrets Operator, connection pooling)
Experience with HashiCorp Vault for secrets management, particularly database credential rotation and dynamic secrets
Understanding of GitOps workflows, repository structures, and governance patterns
Experience with CI/CD tools such as Jenkins, GitHub Actions, or ArgoCD
Experience with monitoring tools such as Datadog, Splunk, ELK, or Prometheus/Grafana—specifically for data infrastructure observability
Relational database experience with PostgreSQL or MySQL including operational knowledge of replication, failover, and performance tuning
NoSQL experience with at least one of: DynamoDB, Cassandra, or MongoDB including understanding of consistency models and partition strategies

Responsibilities

Design, implement, and validate disaster recovery architectures for relational, NoSQL, and managed data services across AWS, Azure, or GCP
Plan and execute database migration cutovers including blue-green database swaps, read-replica promotion, and zero-downtime schema migration workflows
Architect replication topologies (cross-region, cross-account, active-passive, active-active) and validate RPO/RTO targets through runbook-driven DR drills
Build and maintain Infrastructure as Code for data platform provisioning (RDS, Aurora, DynamoDB, ElastiCache, Redshift, managed Kafka/MSK, etc.) using Terraform, Atlantis, and/or CloudFormation
Design backup, snapshot, and point-in-time recovery strategies with automated validation and alerting
Develop automation tooling for data platform operations: failover orchestration, health checks, capacity scaling, and credential rotation
Implement observability for data infrastructure—replication lag monitoring, connection pool health, query performance baselines, and storage growth forecasting
Support production workload migrations including data tier cutovers with rollback plans and data integrity verification
Contribute to multi-tenant Kubernetes platform operations where data services intersect
Participate in regular customer and internal EverOps scrums, providing data architecture guidance and operational status
Document runbooks, architecture decision records (ADRs), and operational playbooks for data platform operations
