Site Reliability Engineering Manager

Posted 6 months ago
132,439 - 208,378 USD per year
Full-Time
Nonprofit, Free Knowledge, Software
Company: Wikimedia Foundation
Location: Australia, Austria, Bangladesh, Belgium, Brazil, Canada, Colombia, Costa Rica, Croatia, Czech Republic, Denmark, Egypt, Estonia, Finland, France, Germany, Ghana, Greece, India, Indonesia, Ireland, Israel, Italy, Kenya, Mexico, Netherlands, Nigeria, Peru, Poland, Singapore, South Africa, Spain, Sweden, Switzerland, Uganda, United Kingdom, United States of America, Uruguay, UTC-6, UTC-5, UTC-4, UTC-3
Languages: English
Seniority level: Manager
Experience: Prior hands-on experience with software or reliability engineering (within the last 3 years preferred)
Skills: Docker, Project Management, Cloud Computing, Kubernetes, Project Coordination, Linux, DevOps, Terraform, Ansible, Networking
Requirements:
Prior experience managing teams
Prior hands-on experience with software or reliability engineering (within the last 3 years preferred)
Ability to analyze complex systems, troubleshoot issues, and devise effective solutions under pressure
Proficiency in project management methodologies to effectively plan, execute, and track new and existing initiatives
Strong understanding of cloud computing, networking, Linux systems administration, containerization (e.g., Docker, Kubernetes), and infrastructure as code (e.g., Terraform, Ansible) to be able to provide technical support to the team
Aptitude for automation and streamlining of tasks
Ability to communicate effectively in both spoken and written English
Ability to work independently, as an effective part of a globally distributed team
Ability to travel several times a year for occasional in-person meetings
B.S. or M.S. in Computer Science or the equivalent in related work experience
Responsibilities:
Managing one to two globally distributed teams within Wikimedia’s Site Reliability Engineering organization
Providing guidance, mentorship, and support to ensure the team's effectiveness and growth
Working with team members to set individual performance goals, and supporting them in meeting and evolving their goals and career path
Recruiting, hiring, and helping onboard new team members
Triaging incoming workload, maintaining focus on priorities, and setting realistic expectations for both peers and team members
Coordinating and communicating with other members of the Wikimedia product & engineering teams on relevant projects, executing complex projects and contributing to the organizational strategy
Continuously developing the roadmap of the team in alignment with other SRE and Product & Technology teams, and helping to draft and execute the team’s annual and quarterly plans
Project managing new and existing initiatives
Leading the definition, refinement, and execution of the processes through which the team manages and performs work
Leading incident response, diagnosis, and follow-up on system alerts and outages across Wikimedia’s production infrastructure
Participating in a 24/7 on-call rotation to handle escalations and provide support for teams to resolve issues
Facilitating the definition and establishment of Service Level Indicators and Objectives with service owners and stakeholders
About the Company
Wikimedia Foundation
251-500 employees