Senior Site Reliability Engineer

Argentina, Brazil, Chile, Colombia, Costa Rica | Contract | Senior
Salary not disclosed

Job Details

Required Skills
AWS, Docker, Python, SQL, GCP, Terraform, GitHub Actions, Datadog

Requirements

  • Experience with Infrastructure as Code using Terraform and GitHub CI/CD.
  • Containerization skills using Docker and ECS.
  • Proficiency in managing and troubleshooting OS, storage, and networking (VPCs, proxies, CDNs).
  • Expertise in administering high-availability datastores (MySQL, Postgres, Neo4j, Redis).
  • Monitoring and instrumentation experience with Datadog, Sentry, and log management.
  • Knowledge of engineering practices including availability, reliability, scalability, and disaster recovery.
  • Proficiency in shell scripting, IaC, Python, and SQL.
  • Familiarity with Agile methodologies.
  • Ability to work asynchronously and manage personal/team workload.
  • Strong communication skills for design documentation and RCA investigations.

Responsibilities

  • Participate in on-call rotation to respond to incidents and support developers.
  • Maintain and extend infrastructure using Terraform, GitHub Actions, Prefect, and AWS services.
  • Build monitoring systems using Datadog, Sentry, and CloudWatch.
  • Automate repeatable manual actions to reduce toil.
  • Improve operational processes including deployments, releases, and migrations.
  • Design and maintain AWS and GCP cloud infrastructure.
  • Debug production issues across all stack levels.
  • Provide infrastructure and architectural planning support.
  • Plan for infrastructure growth.