Staff Engineer, Site Reliability
Babylist
E-commerce Tech
Babylist is remote-first with team members across the U.S. and Canada
Full-Time
Staff
Salary: 226,673 - 271,991 USD per year
Job Details
- Required Skills: AWS, Kubernetes, CI/CD, Terraform, Datadog
Requirements
- Deep hands-on Terraform expertise.
- Proven AWS experience at scale including EKS, RDS, cloud networking, DNS, CDNs, and load balancers.
- Experienced operating Kubernetes in production.
- Experience designing and improving CI/CD systems like CircleCI or GitHub Actions.
- Strong observability instincts with tools such as Datadog, Sentry, PagerDuty, or Cronitor.
- Experienced with on-call rotations and incident management.
- Comfortable supporting developers across local, staging, and production.
- Demonstrated habit of using AI in daily engineering workflows.
Responsibilities
- Manage and evolve the AWS environment using Terraform, keeping EKS clusters, databases, and core services current and performant.
- Own the speed and reliability of CI systems for the engineering organization.
- Support developers across local, staging, and production environments.
- Establish and socialize monitoring and alerting best practices.
- Lead or support incident response and drive post-incident reviews.
- Contribute to architectural decisions shaping future infrastructure.