Senior DevOps Data Engineer
Inactive
EverOps | DevOps
U.S.-Based | Full-Time | Senior
Salary not disclosed
Job Details
- Experience: 5+ years
- Required Skills: AWS, PostgreSQL, Python, Bash, DynamoDB, GCP, Jenkins, Kafka, Kubernetes, MongoDB, MySQL, Azure, Cassandra, Go, Grafana, Prometheus, Rust, Linux, Terraform, Redshift, GitHub Actions, Datadog, CloudFormation
Requirements
- 5+ years of professional experience as a DevOps Engineer, Data Platform Engineer, Database Reliability Engineer, or Site Reliability Engineer with a data infrastructure focus
- Deep hands-on experience designing and operating disaster recovery architectures for production databases (failover, replication, backup/restore, cross-region DR)
- Production experience planning and executing database cutover workflows—blue-green database swaps, read-replica promotions, DMS-based migrations, and zero-downtime schema changes
- Strong experience with AWS managed data services: RDS/Aurora (Multi-AZ, Global Database, cross-region replicas), DynamoDB (Global Tables, PITR, on-demand backup), ElastiCache, Redshift, and/or MSK
- Hands-on experience with Infrastructure as Code (Terraform + Atlantis and/or CloudFormation) for data platform provisioning and lifecycle management
- Hands-on experience with and deep understanding of Linux
- Strong professional experience with at least one of: Python, Golang, Bash, or Rust for automation and tooling
- Production experience with Amazon EKS including understanding of how data workloads intersect with Kubernetes (StatefulSets, PVCs, External Secrets Operator, connection pooling)
- Experience with HashiCorp Vault for secrets management, particularly database credential rotation and dynamic secrets
- Understanding of GitOps workflows, repository structures, and governance patterns
- Experience with CI/CD tools such as Jenkins, GitHub Actions, or ArgoCD
- Experience with monitoring tools such as Datadog, Splunk, ELK, or Prometheus/Grafana—specifically for data infrastructure observability
- Relational database experience with PostgreSQL or MySQL including operational knowledge of replication, failover, and performance tuning
- NoSQL experience with at least one of: DynamoDB, Cassandra, or MongoDB including understanding of consistency models and partition strategies
Responsibilities
- Design, implement, and validate disaster recovery architectures for relational, NoSQL, and managed data services across AWS, Azure, or GCP
- Plan and execute database migration cutovers including blue-green database swaps, read-replica promotion, and zero-downtime schema migration workflows
- Architect replication topologies (cross-region, cross-account, active-passive, active-active) and validate RPO/RTO targets through runbook-driven DR drills
- Build and maintain Infrastructure as Code for data platform provisioning (RDS, Aurora, DynamoDB, ElastiCache, Redshift, managed Kafka/MSK, etc.) using Terraform, Atlantis, and/or CloudFormation
- Design backup, snapshot, and point-in-time recovery strategies with automated validation and alerting
- Develop automation tooling for data platform operations: failover orchestration, health checks, capacity scaling, and credential rotation
- Implement observability for data infrastructure—replication lag monitoring, connection pool health, query performance baselines, and storage growth forecasting
- Support production workload migrations including data tier cutovers with rollback plans and data integrity verification
- Contribute to multi-tenant Kubernetes platform operations where data services intersect
- Participate in regular customer and internal EverOps scrums, providing data architecture guidance and operational status
- Document runbooks, architecture decision records (ADRs), and operational playbooks for data platform operations