Cloud Infrastructure Engineer

Based in the United States | Full-Time | Mid-Level
Salary: 85,000 - 100,000 USD per year

Job Details

Experience
2–5+ years
Required Skills
Python, Kubernetes, Grafana, Prometheus, CI/CD, Linux, Terraform

Requirements

  • 2–5+ years of experience in Cloud Infrastructure Engineering, DevOps, or Site Reliability Engineering roles.
  • Strong hands-on experience operating Kubernetes in production environments.
  • Proven experience building CI/CD pipelines and working with GitOps methodologies.
  • Solid experience with Infrastructure as Code tools such as Terraform or equivalent solutions.
  • Strong Linux administration and troubleshooting skills in production environments.
  • Proficiency in Python or another scripting language for automation and tooling.
  • Good understanding of networking concepts, Kubernetes security, and deployment strategies.
  • Experience with observability tools and performance monitoring solutions.
  • Familiarity with load testing, system tuning, and reliability engineering practices.
  • Strong collaboration and communication skills, with the ability to work across engineering teams.

Responsibilities

  • Design, deploy, and manage production-grade Kubernetes clusters, including network policies, RBAC, workload scheduling, and cluster security configurations.
  • Build and maintain CI/CD pipelines using Infrastructure as Code and GitOps practices to ensure reliable and repeatable deployments.
  • Provision and automate cloud infrastructure using tools such as Terraform or similar IaC frameworks.
  • Develop and manage containerization workflows, including secure image building, versioning, and promotion across environments.
  • Implement and maintain observability stacks using tools such as Prometheus, Grafana, and OpenTelemetry to ensure system health and performance visibility.
  • Support performance optimization efforts including load testing, capacity planning, and system resilience validation.
  • Participate in incident response, root cause analysis, and ongoing reliability engineering improvements.
  • Manage and support stateful services such as databases, caching systems, and messaging platforms in production environments.
  • Maintain clear and comprehensive technical documentation covering architecture, operations, and recovery procedures.