Staff Backend Engineer - Application Core Services, Stacks

USA, EST, CST | Full-Time | Staff
Salary: 174,986 - 209,983 USD per year

Job Details

Required Skills
AWS, GCP, Kubernetes, Azure, Go, Terraform, Helm

Requirements

  • At least 1 year of fully remote work experience
  • Experience working on a large SaaS platform and dealing with common distributed systems problems (e.g., scalability, multi-tenancy, data isolation, HA)
  • Professional experience with Go (Golang) and willingness to work across both backend service and application code
  • Care deeply about developer and user experience and the quality of the products
  • Experience delivering projects end to end in a self-driven way, from gathering requirements and brainstorming ideas to shipping a product into customers’ hands
  • Write clean, robust, well-tested software that other engineers can understand, operate, and maintain
  • Experience mentoring junior engineers in a collaborative but asynchronous environment
  • Can take on complex challenges and break them down to achieve tight learning loops
  • Willing to work across teams and align work with the needs of other squads and external stakeholders
  • Strong Kubernetes experience in AWS, GCP, or Azure
  • Familiarity with infrastructure-as-code tooling (Helm, Terraform, Jsonnet)
  • Experience participating in blameless incident response and writing high-quality post-incident reviews

Responsibilities

  • Design, build, and operate reconciliation systems, including the SSS backend, that track desired stack state and detect and repair drift across stack templates, grafana.com state, Hosted Grafana, and actual customer stack configuration
  • Collaborate across SSS, grafana.com, and deployment configurations to ensure stack lifecycle workflows remain reliable, observable, and resilient
  • Improve operational efficiency by reducing deployment complexity (e.g., aiming for single-PR regional SSS deployment) and contributing to the Stack Config Reconciliation project
  • Manage rollout mechanisms for provisioned plugins, dashboards, data sources, Grafana versions, release channels, and stack-level configuration
  • Support new region and cluster rollouts, including the operational paths required to bring stacks online safely in new Grafana Cloud regions
  • Improve incident response and recovery paths for stack misalignment, reconciliation failures, plugin rollout issues, and Hosted Grafana integration failures
  • Partner with Product, Hosted Grafana, Infrastructure, Support, and adjacent AppCore squads on customer-impacting stack lifecycle work
  • Contribute to roadmap planning, technical design, OnCall improvements, and long-term simplification of stack operations
  • Own the production behavior of the systems you build, including improving runbooks, dashboards, alerts, reconciliation safety, rollout controls, and recovery procedures
  • Participate in our follow-the-sun OnCall rotation
  • Participate in team decisions, such as roadmap planning and prioritization