Senior Staff Software Engineer (Reliability)

Posted 2024-11-13

💎 Seniority level: Senior, 10+ years of software development experience

📍 Location: Canada

💸 Salary: 206000 - 256000 CAD per year

🔍 Industry: Financial technology

🗣️ Languages: English

⏳ Experience: 10+ years of software development experience

🪄 Skills: AWS, Leadership, Python, Software Development, Java, Kotlin, Product Management, C++, Golang, Rust, Mentoring, Coaching

Requirements:
  • 10+ years of software development experience in languages such as Python, Kotlin, Rust, Java, C++, or GoLang.
  • 5+ years of experience in at least two different SRE organizational structures.
  • 5+ years of hands-on work in infrastructure and scaling distributed systems.
  • 5+ years of technical leadership in infrastructure, reliability, and software engineering.
  • Strong experience with Kubernetes and AWS in production.
  • Ability to communicate effectively with engineering teams.
  • Deep knowledge of incident management and developing SLIs and SLOs.
Responsibilities:
  • Create and champion a long-term technical roadmap for reliability practices.
  • Promote a culture of ownership and data-driven decision-making.
  • Elevate architecture and design with a focus on resiliency.
  • Influence Infrastructure teams through reliability guidance.
  • Drive investigations of complex issues.
  • Engage with product management for improved insights.
  • Support growth through hiring and mentoring.
  • Foster technical excellence and constant improvement.
  • Lead incident management implementation.

Related Jobs

📍 Canada

🧭 Full-Time

💸 206000 - 256000 CAD per year

🔍 Financial services

Requirements:
  • 10+ years of software development experience, including Python, Kotlin, Rust, Java, C++, or GoLang.
  • 5+ years in at least two SRE organizational structures.
  • 5+ years of hands-on infrastructure experience and scaling distributed systems.
  • 5+ years of technical leadership in SWE and Reliability teams.
  • Strong hands-on experience with Kubernetes and AWS in production.
  • Ability to communicate decisions and practices effectively.
  • Experience in developing effective SLIs and SLOs.
  • Deep knowledge of incident management and post-incident review.

Responsibilities:
  • Create and champion a long-term technical roadmap for reliability practices across Affirm.
  • Promote a culture of ownership, curiosity, and data-driven decision-making.
  • Coach team members on resilient architecture, technical design, and code reviews.
  • Influence Infrastructure teams to improve customer experience.
  • Drive investigations of complex issues involving software and systems.
  • Support product management to enhance development velocity and reliability.
  • Foster a culture of technical excellence and continuous improvement.
  • Provide leadership in incident management and reliability principles.
  • Focus on human interactions to enable quicker incident resolution.

AWS, Leadership, Python, Software Development, Product Management, C++, Golang, Rust, C (Programming language)

Posted 2024-07-31