Infrastructure Engineer
Brazil, Americas time zones | Full-Time | Middle
Salary: Competitive USD-based compensation aligned with experience and impact.
Job Details
- Required Skills: AWS, Python, GCP, Kubernetes, Azure, Go, Grafana, Prometheus, RDBMS, Linux, Terraform, Ansible, Helm
Requirements
- Experience operating mission-critical production systems with ownership of reliability, uptime, and SLAs.
- Strong hands-on experience with Kubernetes or similar orchestration platforms in production environments.
- Solid background in Infrastructure-as-Code tools such as Terraform, Helm, and Ansible.
- Experience with public cloud providers such as AWS, GCP, or Azure in production workloads.
- Proficiency in Linux systems, shell usage, and systems-level troubleshooting.
- Experience with monitoring and observability tools such as Prometheus, Grafana, or similar stacks.
- Programming ability in Python and/or Go for automation and tooling.
- Strong systems-thinking mindset with the ability to anticipate edge cases, failure modes, and scaling challenges.
- Experience with databases (RDBMS) in production environments.
- A proactive, ownership-driven mindset with strong curiosity and a bias toward fixing and improving systems.
- Strong documentation habits and ability to communicate technical knowledge clearly.
- CKA certification (or equivalent Kubernetes expertise) is highly valued.
- Ability to work hours aligned with Americas time zones.
Responsibilities
- Ensure the reliability, scalability, and performance of mission-critical production systems, including participation in on-call rotations and incident response activities.
- Design, operate, and improve complex hybrid infrastructure environments, primarily multi-cloud Kubernetes clusters, using Infrastructure-as-Code practices.
- Monitor system health, improve observability, and enhance alerting systems to proactively detect and prevent incidents.
- Investigate production issues, identify root causes, and implement both immediate mitigations and long-term fixes for system weaknesses.
- Automate repetitive operational tasks to reduce manual workload and improve system efficiency.
- Plan and implement infrastructure improvements aligned with reliability, scalability, and performance goals.
- Improve technical documentation with clear explanations of system design decisions, not just procedural steps.
- Share knowledge across teams and contribute to raising overall infrastructure engineering maturity.