Infrastructure Engineer

Brazil, Americas time zones | Full-Time | Middle
Salary: Competitive USD-based compensation aligned with experience and impact.

Job Details

Required Skills
AWS, Python, GCP, Kubernetes, Azure, Go, Grafana, Prometheus, RDBMS, Linux, Terraform, Ansible, Helm

Requirements

  • Experience operating mission-critical production systems with ownership of reliability, uptime, and SLAs.
  • Strong hands-on experience with Kubernetes or similar orchestration platforms in production environments.
  • Solid background in Infrastructure-as-Code tools such as Terraform, Helm, and Ansible.
  • Experience with public cloud providers such as AWS, GCP, or Azure in production workloads.
  • Proficiency in Linux systems, shell usage, and systems-level troubleshooting.
  • Experience with monitoring and observability tools such as Prometheus, Grafana, or similar stacks.
  • Programming ability in Python and/or Go for automation and tooling.
  • Strong systems-thinking mindset, with the ability to anticipate edge cases, failure modes, and scaling challenges.
  • Experience with databases (RDBMS) in production environments.
  • A proactive, ownership-driven mindset with strong curiosity and a bias toward fixing and improving systems.
  • Strong documentation habits and ability to communicate technical knowledge clearly.
  • CKA certification (or equivalent Kubernetes expertise) is highly valued.
  • Ability to work hours aligned with Americas time zones.

Responsibilities

  • Ensure the reliability, scalability, and performance of mission-critical production systems, including participation in on-call rotations and incident response activities.
  • Design, operate, and improve complex hybrid infrastructure environments, primarily multi-cloud Kubernetes clusters, using Infrastructure-as-Code practices.
  • Monitor system health, improve observability, and enhance alerting systems to proactively detect and prevent incidents.
  • Investigate production issues, identify root causes, and implement both immediate mitigations and long-term fixes for system weaknesses.
  • Automate repetitive operational tasks to reduce manual workload and improve system efficiency.
  • Plan and implement infrastructure improvements aligned with reliability, scalability, and performance goals.
  • Improve technical documentation with clear explanations of system design decisions, not just procedural steps.
  • Share knowledge across teams and contribute to raising overall infrastructure engineering maturity.