Senior Infrastructure Engineer
Remote-first work arrangement within the United States
Full-Time
Senior
Salary: Competitive base salary with performance-based bonus and equity opportunities
Job Details
- Experience: 5+ years of experience working with AWS cloud infrastructure in production environments; 3+ years of experience with Kubernetes and container orchestration technologies.
- Required Skills: AWS, Docker, Python, Kubernetes, Go, Postgres, CI/CD, Terraform, Helm
Requirements
- Bachelor’s degree in Computer Science, Computer Engineering, or a related field.
- 5+ years of experience working with AWS cloud infrastructure in production environments.
- 3+ years of experience with Kubernetes and container orchestration technologies.
- Strong programming skills in Python and Go, with experience writing production-grade code.
- Hands-on experience with infrastructure-as-code tools such as Terraform, Helm, or similar.
- Experience with Docker and containerized application environments.
- Knowledge of cloud or on-prem storage systems and distributed infrastructure patterns.
- Experience supporting production systems such as Kafka, ZooKeeper, and Postgres.
- Familiarity with GitOps workflows, CI/CD pipelines, and deployment strategies.
- Strong understanding of observability, monitoring, and incident response practices.
- Ability to mentor junior engineers and contribute to engineering best practices.
Responsibilities
- Design, build, and maintain scalable cloud infrastructure while improving automation, reliability, and operational efficiency.
- Develop and enhance internal infrastructure tools and automation frameworks.
- Design, deploy, and manage AWS-based infrastructure using services such as EC2, RDS, CloudFormation, and ElastiCache.
- Build and maintain Kubernetes clusters, including workload orchestration, scaling, and lifecycle management.
- Implement infrastructure-as-code solutions using tools such as Terraform and Helm.
- Drive observability and monitoring strategies using tools like CloudWatch, New Relic, or Nagios.
- Collaborate with development and operations teams to improve system reliability, performance, and deployment processes.
- Lead code reviews and ensure infrastructure changes meet security and reliability standards.
- Design highly available systems across multiple availability zones.
- Document architectural decisions and operational processes for cross-team alignment.