DevOps Engineer - Platform Reliability
Bjak | Fintech / Insurance
Remote, China (collaboration with Malaysia-based teams) | Full-Time | Middle
Salary not disclosed
Job Details
Required Skills
- Docker, Kubernetes, Prometheus, CI/CD, Terraform, Ansible
Requirements
- Experience in DevOps, SRE, platform engineering or infrastructure-focused roles.
- Strong understanding of cloud infrastructure, CI/CD pipelines and deployment systems.
- Experience with production monitoring, alerting and incident management practices.
- Ability to troubleshoot infrastructure and production issues in a structured and calm manner.
- Strong understanding of reliability engineering principles (availability, fault tolerance, recovery).
- Experience supporting business-critical or high-availability systems.
- Strong ownership mindset during incidents and operational failures.
- Practical judgment on reliability, performance, security and cost trade-offs.
- Comfortable working closely with engineering teams in fast-paced environments.
Responsibilities
- Own and improve platform reliability across production systems and environments.
- Manage cloud infrastructure, deployment pipelines and runtime environments.
- Design and improve CI/CD workflows to enable safe, fast and repeatable releases.
- Build and enhance monitoring, alerting, logging and system observability.
- Lead incident response efforts and perform structured root cause analysis.
- Improve system resilience through redundancy, failover and recovery mechanisms.
- Work with engineering teams to reduce production risk through better deployment and system design practices.
- Strengthen infrastructure security, access control and secrets management.