Sr Site Reliability Engineer

India | Full-Time | Senior
Salary: 5,000,000 - 10,000,000 INR per year

Job Details

Experience
5–8 years
Required Skills
Kafka, Kubernetes, ClickHouse, Go, CI/CD, Distributed Systems

Requirements

  • 5–8 years of experience in SRE, infrastructure, platform engineering, or backend systems roles
  • Deep hands-on expertise with Kubernetes in production-scale environments
  • Strong understanding of distributed systems, failure modes, performance tuning, and capacity planning
  • Experience working with high-scale data systems (ClickHouse, Kafka, or similar)
  • Proficiency in at least one programming language (Go strongly preferred)
  • Familiarity with observability concepts (metrics, logs, and traces) and tools such as OpenTelemetry
  • Strong problem-solving skills with the ability to debug complex production issues
  • Excellent communication skills with the ability to write clear documentation and runbooks
  • Experience in fast-paced, high-ownership, remote-first environments

Responsibilities

  • Design, operate, and improve large-scale Kubernetes infrastructure including upgrades, scaling, networking, and multi-tenancy
  • Ensure system reliability through strong SRE practices, including SLOs, SLIs, error budgets, incident response, and on-call optimization
  • Scale and maintain high-throughput ingestion pipelines handling petabyte-scale observability data
  • Operate, tune, and optimize data systems such as ClickHouse for performance, cost efficiency, and reliability
  • Build automation and tooling using infrastructure-as-code and CI/CD to improve deployment and operational efficiency
  • Monitor, debug, and resolve complex production issues across distributed systems
  • Improve observability of the platform itself using modern monitoring, logging, and tracing practices