Senior Site Reliability Engineer

Cribl | Cloud Observability
Remote - Poland | Full-Time | Senior
Salary not disclosed

Job Details

Required Skills
AWS, Node.js, JavaScript, Kubernetes, TypeScript, Azure, Linux, Terraform, Ansible

Requirements

  • Proven experience designing, implementing, and operating observability systems for complex cloud-based platforms
  • Experience with configuration management and Infrastructure as Code tools such as Terraform (preferred) or Ansible
  • Knowledge of cloud platforms (AWS and Azure preferred) and of container and orchestration technologies
  • Experience with APM and Observability tools such as New Relic, Splunk, CloudWatch, Prometheus, Grafana/Kibana, Sentry
  • Extensive experience with enterprise-scale continuous delivery environments
  • Development with JavaScript/Node.js/TypeScript in a Linux/Mac environment
  • Experience with sustainable incident response in a blameless environment
  • Background in Linux Systems Engineering
  • Experience with incident response tools (e.g., PagerDuty, FireHydrant, Blameless)

Responsibilities

  • Engage with teams to improve service delivery and reliability across the entire service lifecycle
  • Measure and monitor all production systems with an eye towards availability, latency and overall system health
  • Seek out the cause of errors and instability in our production cloud services and drive teams towards better operational excellence
  • Engage with product and platform teams to improve and evolve systems by lobbying for changes that improve reliability, resilience, and observability
  • Help identify and drive down toil with creative innovation and automation