📍 Brazil, Chile, Colombia, Mexico, Peru, Argentina
🧭 Full-Time
🔍 Software Development
🏢 Company: CookUnity
👥 501-1000
💰 $47,000,000 about 2 years ago
Food Delivery, Food and Beverage, Consumer Applications, Subscription Service, Organic Food
- 7+ years in DevOps, SRE, or related roles in cloud-native environments, with at least 5 years of direct experience managing AWS infrastructure at scale.
- Proficiency in deploying, managing, and troubleshooting Kubernetes clusters, especially AWS EKS, including networking, RBAC, and Helm.
- Advanced English level.
- Advanced hands-on experience with ArgoCD for GitOps-based Kubernetes deployments, including setup, configuration, and troubleshooting.
- Strong development and scripting skills in Kotlin, Python, and Bash, with the ability to build automation tools and integrate with APIs.
- Deep knowledge of CI/CD concepts and tools, with proven experience building and maintaining pipelines for cloud-native applications.
- Demonstrated ability to design and implement infrastructure as code using Terraform and/or AWS CloudFormation.
- Strong problem-solving skills, including root cause analysis and incident management in distributed, cloud-based systems.
- Excellent communication and collaboration abilities, working effectively across development, QA, and operations teams.
- Architect, deploy, and manage highly available and scalable infrastructure on AWS, leveraging services such as EC2, VPC, S3, IAM, and EKS.
- Design, implement, and maintain Kubernetes clusters (EKS) and oversee the deployment of containerized applications using best practices for security, scaling, and automation.
- Develop and manage GitOps workflows using ArgoCD for automated, reliable, and auditable application deployments to Kubernetes.
- Write and maintain infrastructure as code (IaC) using tools such as Terraform.
- Build, optimize, and troubleshoot CI/CD pipelines to support rapid, reliable software delivery, integrating with ArgoCD and other modern DevOps toolchains.
- Develop robust automation scripts and tools in languages such as Kotlin, Python, and/or Bash to streamline operational processes, monitoring, and incident response.
- Proactively monitor system performance, reliability, and security, responding to incidents and participating in on-call rotations as needed.
- Collaborate with software engineers to improve deployment strategies, system observability, and overall site reliability.
- Implement and enforce security best practices across all infrastructure and deployment workflows.
- Maintain comprehensive documentation of infrastructure, processes, and procedures for operational transparency and team knowledge sharing.
- Experience using GitHub and GitHub Actions to automate testing and deployments.
AWS, Docker, Python, Bash, Git, Kotlin, Kubernetes, Prometheus, CI/CD, Linux, DevOps, Terraform, Networking, Ansible, Scripting
Posted 11 days ago