4+ years of experience designing, developing and launching backend systems at scale using scripting and development languages like Bash, Python or Kotlin
Track record of developing highly available distributed systems using technologies like AWS, MySQL and Kubernetes
Meaningful experience contributing to or driving parts of the Incident Lifecycle process
4+ years working in a Site Reliability or Production Engineering team
Experience defining a technical plan for the delivery of a significant feature or system component with an elegant, simple and extensible design
Experience making impactful changes in a large codebase
Strong verbal and written communication skills
Responsibilities:
Owning and delivering quarterly goals for the team
Leading engineers through ambiguity to solve open-ended problems
Supporting peers and stakeholders in the product development lifecycle
Proactively identifying technical solutions and operational processes
Supporting the operations and availability of the team's artifacts
Fostering a culture of quality and ownership on the team