Staff Software Engineer, Infrastructure

Based in the United States | Full-Time | Staff
Salary not disclosed

Job Details

Experience
8+ years
Required Skills
Kubernetes, Go, Grafana, CI/CD, Linux, Terraform

Requirements

  • 8+ years of professional software engineering experience in backend, infrastructure, or platform engineering roles.
  • Strong hands-on expertise in Go or similar backend languages.
  • Proven experience building, scaling, and operating production infrastructure or cloud-based platforms.
  • Deep knowledge of at least one of the following: Kubernetes, cloud infrastructure, networking, reliability engineering, or developer platforms.
  • Strong understanding of Linux systems, networking fundamentals, and production operations at scale.
  • Experience driving cross-team alignment and influencing technical direction through design documents, RFCs, and architecture reviews.
  • Familiarity with modern DevOps practices and tooling, such as Terraform, CI/CD pipelines, GitOps, and observability tooling (Prometheus, OpenTelemetry, Grafana).
  • Strong communication skills for a distributed, remote-first environment.

Responsibilities

  • Define and lead the evolution of internal infrastructure platforms by turning ambiguous technical challenges into scalable architectural proposals and driving them through RFCs and cross-team alignment.
  • Design and build self-service platform capabilities and APIs (primarily in Go) for provisioning, onboarding, deployment, observability, and operational workflows.
  • Establish and improve delivery standards using Terraform, GitOps (Argo CD), CI/CD pipelines, and progressive deployment strategies.
  • Architect and evolve multi-region, multi-account infrastructure on Kubernetes (EKS), including networking, ingress, traffic routing, and cross-region connectivity.
  • Improve platform reliability and operational maturity through enhanced SLOs, monitoring, alerting, and incident management practices using observability tools.
  • Drive adoption of platform capabilities across engineering teams by ensuring solutions are usable and reduce operational friction.
  • Participate in on-call rotations while improving operational health through better alerting, runbooks, and long-term reliability work.