Site Reliability Engineer
SupplyHouse.com
E-commerce, HVAC
Remote from India, 4-5 hours overlap with 8:00 a.m. to 5:00 p.m. U.S. Eastern Time
Full-Time
Middle
Salary: 29,000 - 36,000 USD per year
Job Details
- Languages: English
- Experience: 3+ years
- Required Skills: Docker, Python, GCP, Kubernetes, Grafana, Prometheus, CI/CD, Linux, Terraform, Ansible
Requirements
- Bachelor's degree in Computer Science, Engineering, or a related field
- 3+ years of hands-on experience as an SRE, DevOps, Systems, or Cloud Infrastructure Engineer
- Proven track record managing production-grade systems on Google Cloud Platform (GCP)
- Strong understanding of Linux/Unix system administration, networking, and troubleshooting
- Experience implementing Infrastructure as Code (IaC) using Terraform, Ansible, or Deployment Manager
- Familiarity with containerization and orchestration (Docker, Kubernetes/GKE)
- Experience with monitoring/observability tools (Google Cloud Operations Suite, Prometheus, Grafana, Datadog, ELK)
- Proficiency in at least one scripting language (Python, Bash, or Go)
- Hands-on experience building or managing CI/CD pipelines
- High proficiency in written and verbal English communication
Responsibilities
- Design, build, and maintain scalable, reliable systems on GCP (Compute Engine, GKE, Cloud Storage, Cloud SQL)
- Develop automation for infrastructure provisioning using Terraform, Ansible, or Deployment Manager
- Build and maintain observability platforms using tools such as Google Cloud Operations Suite (formerly Stackdriver), Prometheus, or Grafana
- Manage incident response, conduct postmortems, and implement improvements
- Partner with DevOps and engineering teams to enhance CI/CD pipelines
- Define and monitor SLAs, SLOs, and SLIs
- Implement disaster recovery and backup strategies
- Continuously optimize performance, capacity, and cost-efficiency