Site Reliability Engineer

India, Full-Time, Middle
Salary not disclosed

Job Details

Experience
4–5 years
Required Skills
AWS, Python, Bash, Kubernetes, Go, Grafana, Prometheus, CI/CD, Terraform, Helm

Requirements

  • 4–5 years of experience in Site Reliability Engineering, DevOps, or similar infrastructure-focused roles.
  • Strong hands-on experience with Kubernetes in production environments.
  • Experience with cloud platforms, especially AWS (EKS, VPC, S3, IAM, ECR).
  • Solid understanding of infrastructure-as-code tools such as Terraform, and of Git-based workflows.
  • Experience building and managing CI/CD pipelines and automation systems.
  • Proficiency in Helm charts and containerized deployment strategies.
  • Strong scripting skills in Bash and familiarity with at least one programming language (Go or Python preferred).
  • Experience working with distributed systems, microservices architectures, and cloud-native ecosystems.
  • Strong debugging, troubleshooting, and problem-solving skills in production environments.
  • Experience with observability tools such as Grafana, Prometheus, Loki, or similar stacks.

Responsibilities

  • Deploy, manage, and scale distributed systems across multi-region cloud environments with a focus on high availability and performance.
  • Design, maintain, and optimize Kubernetes-based infrastructure for large-scale production workloads.
  • Build and manage Helm charts to enable consistent, automated, and repeatable deployments.
  • Implement and maintain infrastructure-as-code solutions using tools such as Terraform and related automation frameworks.
  • Monitor system health using observability tools such as Grafana, Prometheus, and logging stacks.
  • Collaborate with development teams to improve CI/CD pipelines, deployment reliability, and production readiness.
  • Lead incident response, troubleshooting, and root cause analysis for production issues.
  • Develop and maintain runbooks, operational documentation, and best practices for system reliability.