Senior Site Reliability Engineer
ClickHouse, Cloud Infrastructure
Singapore (Remote), Full-Time, Senior
Salary not disclosed
Job Details
- Experience: At least 8 years
- Required Skills: AWS, Python, GCP, Kubernetes, Azure, Go, Terraform, Ansible
Requirements
- Bachelor’s or Master’s degree in Computer Science or a related field.
- At least 8 years of experience in Site Reliability Engineering or a related field.
- Hands-on experience with Go and/or Python.
- Strong knowledge of cloud computing platforms such as AWS, Azure, or Google Cloud Platform.
- Hands-on experience with container orchestration tools such as Kubernetes or Docker Swarm.
- Strong experience with automation and configuration management tools such as Ansible, Terraform, or Puppet.
- Strong problem-solving and production debugging skills.
- Excellent communication and interpersonal skills.
Responsibilities
- Collaborate with engineering teams to design scalable, secure, and high-availability systems.
- Establish and manage SLOs and SLAs for ClickHouse Cloud.
- Implement monitoring and alerting for Data Plane, Control Plane, and Core components.
- Refine incident response processes and conduct blameless post-mortems.
- Continuously improve the reliability and performance of ClickHouse services.
- Plan and drive chaos engineering initiatives.
- Manage on-call processes, performance issue resolution, and escalation coordination.