Staff Backend Engineer - Adaptive Telemetry

Canada, Canadian time zones | Full-Time | Staff
Salary: 186368 - 223642 CAD per year

Job Details

Required Skills
Python, Kafka, Kubernetes, C++, Go, Grafana, Prometheus, Rust

Requirements

  • Proven delivery of large distributed systems: experience shipping and operating complex systems that span multiple teams, with clear evidence of technical leadership and impact
  • Strong systems-design instincts: deep understanding of tradeoffs around latency, consistency, availability, scaling and cost
  • Hands-on cloud and platform experience: solid experience with cloud-native architectures (microservices, containers/Kubernetes, IaC)
  • Comfortable defining SLOs/SLIs, doing capacity planning, tuning performance, and driving reliability work end-to-end
  • Excellent coding and design skills: ability to write clear, maintainable, well-tested code and lead technical designs
  • Comfort with AI-assisted development: practical experience folding AI-powered developer tools into a team’s workflow
  • Experience with messaging and telemetry (Kafka, Prometheus/Grafana or equivalents)
  • Ability to align cross-functional stakeholders, set priorities and drive outcomes in a remote-first environment
  • Clear written and verbal communication that works across engineers and non-technical stakeholders

Responsibilities

  • Drive technical strategy and roadmap: proactively define the architectural vision, prioritize work that unlocks major product or platform improvements, and influence product and engineering decisions
  • Lead end-to-end delivery of large, cross-functional projects: own planning, design, execution, rollout and long-term operation of large initiatives
  • Own architecture, reliability, performance and cost for critical systems: make pragmatic architecture choices that balance scalability, availability, latency and cost while ensuring systems remain maintainable and evolvable
  • Define SLOs/SLIs and lead incident response: establish measurable reliability targets, run high-severity incident response, lead blameless post-mortems, and drive systemic fixes and automation to prevent recurrence
  • Improve observability, automation and operational readiness: champion telemetry, alerting, runbooks, capacity planning and automation efforts that reduce toil, speed debugging and lower MTTR
  • Align stakeholders and remove blockers: coordinate across Product, Design and other teams to align priorities, negotiate tradeoffs, and unblock delivery for large initiatives
  • Mentor and grow engineering talent: coach senior and mid-level engineers, lead design reviews, raise engineering standards, and help teammates make sound technical tradeoffs
  • Represent engineering internally and externally: communicate technical strategy clearly to non-engineering stakeholders and represent the team in cross-team planning