Senior Staff Production Engineer
Zscaler
Cloud Infrastructure
This role is available as a hybrid opportunity, 3 days a week in San Jose, CA, or as a remote position.
Full-Time
Senior
Salary: 140,000 - 200,000 USD per year
Job Details
- Experience: 8+ years
- Required Skills: AWS, Python, GCP, Azure, Go, Grafana, Prometheus, Linux
Requirements
- 8+ years of experience managing reliability, scalability, and availability for large-scale production services
- Deep expertise in programming (Python, Go, or C/C++)
- Strong background in networking protocols, Linux/FreeBSD systems, and distributed architecture
- Experience in high-stakes incident management and participation in a 24/7 on-call rotation
- Proficiency in leveraging ITIL frameworks
Responsibilities
- Design and implement highly available, scalable infrastructure across AWS, Azure, GCP, and bare-metal environments
- Drive an 'automation-first' culture by writing code (Python/Go) to eliminate manual toil and build self-healing systems
- Implement and maintain observability, define SLIs/SLOs, and establish error budgets
- Act as a lead Incident Commander, develop response playbooks, and conduct deep-dive post-incident analyses