Senior Engineering Manager, Reliability
Twilio, Communications/Platform Engineering
Remote - Ireland, Full-Time, Manager
Salary not disclosed
Job Details
- Experience: 5+ years of management experience, including 3+ years leading a team focused on Reliability Engineering
- Required Skills: AWS, Docker, Kafka, Kubernetes, Terraform, Distributed Systems
Requirements
- 5+ years of management experience, including 3+ years leading a team focused on Reliability Engineering.
- Proven track record managing and responding to incidents.
- Experience with reliability modeling in distributed systems (failure mode analysis, chaos engineering, graceful degradation, automated recovery).
- Operational experience with complex distributed systems, including defining/monitoring SLIs and SLOs.
- Experience with Kubernetes, Docker, Kafka, and Terraform.
- Strong product and architectural vision.
- Excellent written and oral communication skills.
- Ability to work effectively in cross-functional teams.
Responsibilities
- Drive a culture focused on high availability and reliability for customer-facing services.
- Empower and mentor a team of skilled site reliability engineers (SREs).
- Lead career development for junior and senior team members.
- Collaborate on best practices for building and operating services at scale in AWS.
- Drive technical roadmaps and modernization initiatives.
- Leverage metrics, SLIs, and SLOs to identify and address system gaps.
- Lead post-incident reviews and implement follow-up actions.
- Conduct audits and risk assessments for security and reliability.
- Increase automation to reduce toil in deployment and operations.