Senior Site Reliability Engineer

Argentina, Brazil, Chile, Colombia, Costa Rica | Contract | Senior

Salary not disclosed

Job Details

Requirements

Experience with Infrastructure as Code using Terraform and GitHub CI/CD.
Containerization skills using Docker and ECS.
Proficiency in managing and troubleshooting OS, storage, and networking (VPCs, proxies, CDNs).
Expertise in administering high-availability datastores (MySQL, Postgres, Neo4j, Redis).
Monitoring and instrumentation experience with Datadog, Sentry, and log management.
Knowledge of engineering practices including availability, reliability, scalability, and disaster recovery.
Proficiency in Shell, IaC, Python, and SQL.
Familiarity with Agile methodologies.
Ability to work asynchronously and manage personal/team workload.
Strong communication skills for design documentation and RCA investigations.

Responsibilities

Participate in the on-call rotation to respond to incidents and support developers.
Maintain and extend infrastructure using Terraform, GitHub Actions, Prefect, and AWS services.
Build monitoring systems using Datadog, Sentry, and CloudWatch.
Automate repetitive manual tasks to reduce toil.
Improve operational processes including deployments, releases, and migrations.
Design and maintain AWS and GCP cloud infrastructure.
Debug production issues across all stack levels.
Provide infrastructure and architectural planning support.
Plan for infrastructure growth.
