Senior Site Reliability Engineer
Argentina, Brazil, Chile, Colombia, Costa Rica
Contract
Senior
Salary not disclosed
Job Details
- Required Skills: AWS, Docker, Python, SQL, GCP, Terraform, GitHub Actions, Datadog
Requirements
- Experience with Infrastructure as Code using Terraform and CI/CD with GitHub Actions.
- Containerization skills using Docker and ECS.
- Proficiency in managing and troubleshooting OS, storage, and networking (VPCs, proxies, CDNs).
- Expertise in administering high-availability datastores (MySQL, Postgres, Neo4j, Redis).
- Monitoring and instrumentation experience with Datadog, Sentry, and log management.
- Knowledge of engineering practices including availability, reliability, scalability, and disaster recovery.
- Proficiency in Shell, IaC, Python, and SQL.
- Familiarity with Agile methodologies.
- Ability to work asynchronously and manage personal/team workload.
- Strong communication skills for design documentation and RCA investigations.
Responsibilities
- Participate in on-call rotation to respond to incidents and support developers.
- Maintain and extend infrastructure using Terraform, GitHub Actions, Prefect, and AWS services.
- Build monitoring systems using Datadog, Sentry, and CloudWatch.
- Automate repeatable manual actions to reduce toil.
- Improve operational processes including deployments, releases, and migrations.
- Design and maintain AWS and GCP cloud infrastructure.
- Debug production issues across all stack levels.
- Provide infrastructure and architectural planning support.
- Plan for infrastructure growth.