Job Details
- Required Skills: AWS, Python, Kubernetes, ClickHouse, CI/CD, Terraform, GitHub Actions
Requirements
- Strong Terraform experience (production-level)
- Solid AWS infrastructure experience
- Kubernetes / EKS administration and operations
- Containers and cloud-native infrastructure
- SRE mindset and operational judgment
- Ability to understand systems under the hood
- Python for automation (nice-to-have)
- Database administration experience (nice-to-have)
- Experience with ClickHouse, Redis, LiteLLM, or Langfuse (nice-to-have)
- Experience with observability tools such as Prometheus or Grafana (nice-to-have)
Responsibilities
- Deploy and maintain infrastructure using Terraform on AWS.
- Work within the corporate golden path, leveraging HCP and HashiCorp Vault.
- Operate and govern production-grade platforms running on Kubernetes / EKS.
- Manage platforms such as Langfuse, LiteLLM, ClickHouse, Redis, and future Data & AI tooling.
- Design and maintain backup, HA, scaling, and upgrade strategies.
- Troubleshoot production incidents and improve operational recovery.
- Build and maintain CI/CD pipelines using GitHub Actions.
- Create operational runbooks and reduce manual toil through automation.