Senior Site Reliability Engineer
India, EST | Full-Time | Senior
Salary not disclosed
Job Details
- Experience: 5+ years
- Required Skills: AWS, Docker, Python, Bash, Kubernetes, CI/CD, Terraform, Datadog
Requirements
- 5+ years of experience in Site Reliability Engineering, DevOps, or cloud infrastructure engineering roles
- Strong hands-on experience with AWS and container orchestration platforms such as Kubernetes (EKS/GKE)
- Solid experience with CI/CD tools such as GitHub Actions, CircleCI, Argo Workflows, or similar systems
- Proficiency in Infrastructure as Code using Terraform or equivalent tools
- Strong programming or scripting experience in Python, Bash, or similar languages
- Experience with monitoring and observability tools such as Datadog, Sentry, or OpenSearch
- Understanding of authentication, authorization, and secure cloud infrastructure design
- Ability to work independently in a high-ownership, production-critical environment
- Strong problem-solving skills with experience in debugging complex distributed systems
Responsibilities
- Own and operate cloud infrastructure, ensuring high availability, performance, and reliability of production systems
- Design, implement, and maintain CI/CD pipelines, deployment automation, and DevOps tooling using modern practices
- Manage and scale Kubernetes-based environments (EKS/GKE) and containerized workloads using Docker
- Build and maintain infrastructure-as-code solutions using Terraform and other automation frameworks
- Implement and improve observability systems, including monitoring, logging, alerting, and incident response processes
- Participate in on-call rotations, root cause analysis (RCA), and post-incident reviews to improve system resilience
- Collaborate with engineering teams to troubleshoot production issues and guide architecture and deployment decisions