🧭 Full-Time
💸 $145,000 - $195,000 USD per year
🔍 Software Development
🏢 Company: Cavnue
👥 101-250
💰 $130,000,000 Series A about 3 years ago
Information Services, Autonomous Vehicles, Software
- 5+ years of hands-on experience in infrastructure engineering, DevOps, or SRE roles, with a track record of operating production cloud environments at scale.
- Strong experience using Terraform for infrastructure provisioning and configuration management in cloud environments.
- Proficiency in multi-cloud operations – Google Cloud Platform (GCP) is highly preferred; experience with Amazon Web Services (AWS) and/or Microsoft Azure is also acceptable.
- Deep understanding of Kubernetes (required), including experience setting up and managing Kubernetes clusters, deploying containerized applications, and debugging cluster and networking issues.
- Ability to write clean, maintainable code for automation and tooling in Python and/or Golang.
- Familiarity with basic networking concepts and protocols (TCP/IP, DNS, load balancing, VLANs/VPCs, firewalls) and how they apply in cloud and hybrid environments.
- Willingness to take part in on-call rotations and proven skills in troubleshooting and resolving infrastructure incidents under pressure.
- Strong hands-on skills with Linux and command-line tools; you are comfortable using terminals and utilities (e.g. k9s for Kubernetes, tmux sessions, zsh or similar shells) to manage and debug systems efficiently.
- Knowledge of zero trust architecture principles and a habit of incorporating security best practices into infrastructure design (formal security certifications are not required).
- Excellent communication skills with the ability to work cross-functionally. You can collaborate in a fast-paced engineering organization, explain complex infrastructure concepts to team members, and contribute to a positive engineering culture.
- Design and implement cloud and edge infrastructure.
- Use Terraform to provision and manage infrastructure resources consistently across multiple cloud providers (GCP preferred, with AWS/Azure as needed), enabling reproducible and auditable infrastructure changes.
- Deploy, administer, and optimize Kubernetes clusters for containerized workloads. Handle cluster upgrades, scaling, and monitoring, and troubleshoot complex issues in production Kubernetes environments.
- Develop robust automation scripts and internal tools/services in Python and/or Golang to automate routine tasks, integrate systems, and improve operational efficiency across the infrastructure.
- Implement monitoring, logging, and alerting solutions to track system performance and reliability. Proactively tune systems and address bottlenecks to maintain smooth operation of critical services.
- Embed security best practices into the infrastructure, enforcing zero trust architecture principles (e.g. least privilege, identity-based access) to protect systems and data. Work closely with security teams to remediate vulnerabilities and ensure compliance with company policies.
- Participate in an on-call rotation during the team’s initial growth phase, quickly responding to infrastructure incidents and leading efforts to restore service and perform root cause analysis.
- Work closely with all teams to understand application needs and translate them into scalable infrastructure solutions. Communicate clearly across teams and document designs and processes for broad understanding.
- Stay up to date with emerging technologies and industry best practices in cloud infrastructure, DevOps, and platform engineering. Lead or contribute to infrastructure projects that enhance deployment speed, cost efficiency, and overall platform reliability.
Posted 3 days ago