Staff Backend Engineer - Adaptive Telemetry
Canada, Canadian time zones · Full-Time · Staff
Salary: 186,368 - 223,642 CAD per year
Job Details
- Required Skills: Python, Kafka, Kubernetes, C++, Go, Grafana, Prometheus, Rust
Requirements
- Proven delivery of large distributed systems
  - Experience shipping and operating complex systems that span multiple teams, with clear evidence of technical leadership and impact
- Strong systems-design instincts
  - Deep understanding of tradeoffs around latency, consistency, availability, scaling and cost
- Hands-on cloud and platform experience
  - Solid experience with cloud-native architectures (microservices, containers/Kubernetes, IaC)
  - Comfortable defining SLOs/SLIs
  - Comfortable doing capacity planning, tuning performance, and driving reliability work end-to-end
- Excellent coding and design skills
  - Ability to write clear, maintainable, well-tested code and lead technical designs
- Comfort with AI-assisted development
  - Practical experience folding AI-powered developer tools into a team’s workflow
- Experience with messaging and telemetry (Kafka, Prometheus/Grafana or equivalents)
- Ability to align cross-functional stakeholders, set priorities and drive outcomes in a remote-first environment
- Clear written and verbal communication that works across engineers and non-technical stakeholders
Responsibilities
- Drive technical strategy and roadmap
  - Proactively define the architectural vision, prioritize work that unlocks major product or platform improvements, and influence product and engineering decisions
- Lead end-to-end delivery of large, cross-functional projects
  - Own planning, design, execution, rollout and long-term operation of large initiatives
- Own architecture, reliability, performance and cost for critical systems
  - Make pragmatic architecture choices that balance scalability, availability, latency and cost while ensuring systems remain maintainable and evolvable
- Define SLOs/SLIs and lead incident response
  - Establish measurable reliability targets, run high-severity incident response, lead blameless post-mortems, and drive systemic fixes and automation to prevent recurrence
- Improve observability, automation and operational readiness
  - Champion telemetry, alerting, runbooks, capacity planning and automation efforts that reduce toil, speed debugging and lower MTTR
- Align stakeholders and remove blockers
  - Coordinate across Product, Design and other teams to align priorities, negotiate tradeoffs, and unblock delivery for large initiatives
- Mentor and grow engineering talent
  - Coach senior and mid-level engineers, lead design reviews, raise engineering standards, and help teammates make sound technical tradeoffs
- Represent engineering internally and externally
  - Communicate technical strategy clearly to non-engineering stakeholders and represent the team in cross-team planning