DevOps Engineer - Platform Reliability

Bjak Fintech / Insurance
Remote, China (collaboration with Malaysia-based teams) | Full-Time | Middle
Salary not disclosed

Job Details

Required Skills
Docker, Kubernetes, Prometheus, CI/CD, Terraform, Ansible

Requirements

  • Experience in DevOps, SRE, platform engineering or infrastructure-focused roles.
  • Strong understanding of cloud infrastructure, CI/CD pipelines and deployment systems.
  • Experience with production monitoring, alerting and incident management practices.
  • Ability to troubleshoot infrastructure and production issues in a structured and calm manner.
  • Strong understanding of reliability engineering principles (availability, fault tolerance, recovery).
  • Experience supporting business-critical or high-availability systems.
  • Strong ownership mindset during incidents and operational failures.
  • Practical judgment on reliability, performance, security and cost trade-offs.
  • Comfortable working closely with engineering teams in fast-paced environments.

Responsibilities

  • Own and improve platform reliability across production systems and environments.
  • Manage cloud infrastructure, deployment pipelines and runtime environments.
  • Design and improve CI/CD workflows to enable safe, fast and repeatable releases.
  • Build and enhance monitoring, alerting, logging and system observability.
  • Lead incident response efforts and perform structured root cause analysis.
  • Improve system resilience through redundancy, failover and recovery mechanisms.
  • Work with engineering teams to reduce production risk through better deployment and system design practices.
  • Strengthen infrastructure security, access control and secrets management.