Staff Site Reliability Engineer
Posted 2 months ago
Requirements:
- 6+ years in SRE, DevOps, or related roles.
- Strong experience managing and optimizing Kubernetes clusters.
- Proven expertise in designing and implementing automation solutions, including Terraform and Helm.
- Strong programming skills in Shell and Python.
- Extensive experience with Linux system administration and network management.
- Expertise in managing distributed computing systems.
- Fluency in English with solid communication skills.
Responsibilities:
- Own the site reliability process and systems through design, implementation, deployment, and maintenance.
- Ensure scalability, resilience, and performance of solutions across SaaS and client-hosted environments.
- Design and implement automation workflows to streamline operations.
- Ensure security and compliance of infrastructure and processes.
- Collaborate with cross-functional teams on requirements and solutions.
- Document architecture and operational procedures.
- Participate in on-call rotations for incident management.