Senior Site Reliability Engineer

Devsu
Software Engineering
Peru, Argentina, Brazil, Colombia, Guatemala, Honduras (PST)
Full-Time, Senior
Salary not disclosed

Job Details

Languages
English
Required Skills
Python, Bash, GCP, Kubernetes, Grafana, Prometheus, Linux, ServiceNow

Requirements

  • Strong experience as a Site Reliability Engineer or Reliability Engineer
  • Deep hands-on expertise with Grafana
  • Solid experience with monitoring and observability systems
  • Production experience operating Kubernetes environments
  • Experience supporting systems in GCP and on-prem environments
  • Strong Linux systems and troubleshooting skills
  • Fluent English (written and spoken)
  • Ability to work in the PST time zone
  • Ability to participate in an on-call rotation including one weekend day

Responsibilities

  • Own and operate the monitoring and observability stack across on-prem and GCP environments
  • Design, build, and maintain Grafana dashboards for infrastructure, Kubernetes, and applications
  • Define, tune, and maintain alerts to ensure a high signal-to-noise ratio
  • Apply SRE principles to improve availability, performance, and resilience
  • Participate in on-call rotations and SEV incident response
  • Support and monitor Kubernetes environments (GKE and on-prem clusters)
  • Provide L2/L3 application support coverage during resource shortages or major incidents