Senior Engineering Manager, Reliability

Twilio
Communications/Platform Engineering
Remote - Ireland
Full-Time
Manager
Salary not disclosed

Job Details

Experience
5+ years of management experience, including 3+ years leading a team focused on Reliability Engineering
Required Skills
AWS, Docker, Kafka, Kubernetes, Terraform, Distributed Systems

Requirements

  • 5+ years of management experience, including 3+ years leading a team focused on Reliability Engineering.
  • Proven track record managing and responding to incidents.
  • Experience with reliability modeling in distributed systems (failure mode analysis, chaos engineering, graceful degradation, automated recovery).
  • Operational experience with complex distributed systems, including defining/monitoring SLIs and SLOs.
  • Experience with Kubernetes, Docker, Kafka, and Terraform.
  • Strong product and architectural vision.
  • Excellent written and oral communication skills.
  • Ability to work effectively in multi-functional teams.

Responsibilities

  • Drive a culture focused on high availability and reliability for customer-facing services.
  • Empower and mentor a team of skilled site reliability engineers.
  • Lead career development for junior and senior team members.
  • Collaborate on best practices for building and operating services at scale in AWS.
  • Drive technical roadmaps and modernization initiatives.
  • Leverage metrics, SLIs, and SLOs to identify and address system gaps.
  • Lead post-incident reviews and implement follow-up actions.
  • Conduct audits and risk assessments for security and reliability.
  • Increase automation to reduce toil in deployment and operations.