Senior Site Reliability Engineer

ClickHouse, Cloud Infrastructure
Singapore (Remote), Full-Time, Senior
Salary not disclosed

Job Details

Experience
At least 8 years
Required Skills
AWS, Python, GCP, Kubernetes, Azure, Go, Terraform, Ansible

Requirements

  • Bachelor’s or Master’s degree in Computer Science or related field.
  • At least 8 years of experience in Site Reliability Engineering or a related field.
  • Hands-on experience with Go and/or Python.
  • Strong knowledge of cloud computing platforms such as AWS, Azure, or Google Cloud Platform.
  • Hands-on experience with container orchestration tools such as Kubernetes or Docker Swarm.
  • Strong experience with automation and configuration management tools such as Ansible, Terraform, or Puppet.
  • Strong problem-solving and production debugging skills.
  • Excellent communication and interpersonal skills.

Responsibilities

  • Collaborate with engineering teams to design scalable, secure, and high-availability systems.
  • Establish and manage SLOs and SLAs for ClickHouse Cloud.
  • Implement monitoring and alerting for Data Plane, Control Plane, and Core components.
  • Refine incident response processes and conduct blameless post-mortems.
  • Continuously improve the reliability and performance of ClickHouse services.
  • Plan and drive chaos engineering initiatives.
  • Manage on-call processes, performance issue resolution, and escalation coordination.