Staff Backend Engineer - Adaptive Telemetry
USA, United States time zones, Full-Time, Staff
Salary: 174,986 - 209,983 USD per year
Job Details
- Required Skills: Python, Kafka, Kubernetes, C++, Go, Grafana, Prometheus, Rust, Microservices, Distributed Systems
Requirements
- Proven delivery of large distributed systems. Experience shipping and operating complex systems that span multiple teams, with clear evidence of technical leadership and impact.
- Strong systems-design instincts. Deep understanding of tradeoffs around latency, consistency, availability, scaling and cost.
- Hands-on cloud and platform experience. Solid experience with cloud-native architectures (microservices, containers/Kubernetes, IaC) and the operational practices that keep them healthy.
- Reliability and performance ownership. Comfortable defining SLOs/SLIs, doing capacity planning, tuning performance, and driving reliability work end-to-end.
- Excellent coding and design skills. You write clear, maintainable, well-tested code and can lead technical designs. We use Go, but experience with Python, C, C++, Rust or similar languages translates well.
- Comfort with AI-assisted development. We embrace AI and agentic development, so we expect you to be curious about and comfortable with AI-powered developer tools, and ideally to have practical experience folding them into a team’s workflow.
- Experience with messaging and telemetry. Familiarity with streaming/messaging systems (e.g., Kafka) and observability tooling (Prometheus/Grafana or equivalents).
- Influence without authority. Ability to align cross-functional stakeholders, set priorities and drive outcomes in a remote-first environment.
- Strong communicator. Clear written and verbal communication that works across engineers and non-technical stakeholders.
Responsibilities
- Drive technical strategy and roadmap. Proactively define the architectural vision, prioritize work that unlocks major product or platform improvements, and influence product and engineering decisions.
- Lead end-to-end delivery of large, cross-functional projects. Own planning, design, execution, rollout and long-term operation of large initiatives.
- Own architecture, reliability, performance and cost for critical systems. Make pragmatic architecture choices that balance scalability, availability, latency and cost while ensuring systems remain maintainable and evolvable.
- Define SLOs/SLIs and lead incident response. Establish measurable reliability targets, run high-severity incident response, lead blameless post-mortems, and drive systemic fixes and automation to prevent recurrence.
- Improve observability, automation and operational readiness. Champion telemetry, alerting, runbooks, capacity planning and automation efforts that reduce toil, speed debugging and lower MTTR.
- Align stakeholders and remove blockers. Coordinate across Product, Design and other teams to align priorities, negotiate tradeoffs, and unblock delivery for large initiatives.
- Mentor and grow engineering talent. Coach senior and mid-level engineers, lead design reviews, raise engineering standards, and help teammates make sound technical tradeoffs.
- Represent engineering internally and externally. Communicate technical strategy clearly to non-engineering stakeholders and represent the team in cross-team planning.