Site Reliability Engineer

Veeam Software | Data and AI
Remote, United States | Full-Time | Middle
Salary: 109,800 - 252,500 USD per year OTE

Job Details

Experience
3+ years in Software Engineering, with at least 1 year in SRE, Platform Engineering, or DevOps
Required Skills
Kubernetes, TypeScript, Azure, Go, Grafana, Prometheus, CI/CD, DevOps, Terraform

Requirements

  • 3+ years in Software Engineering, with at least 1 year in SRE, Platform Engineering, or DevOps
  • Experience with cloud infrastructure on Azure or a comparable cloud platform
  • Familiarity with regulated or compliance-oriented environments (FedRAMP, CMMC, PCI-DSS, HIPAA)
  • Ability to read and understand code to investigate system behavior
  • Experience with monitoring and observability tools (e.g., Prometheus, Grafana, OpenTelemetry, ELK stack)
  • Experience with IaC tools (Terraform, Terragrunt, or Pulumi)
  • Experience with container orchestration (Kubernetes)
  • Experience with CI/CD tooling (GitHub Actions, Azure DevOps, GitLab CI, or ArgoCD)
  • Strong programming skills in TypeScript/JS, Go, Java, C#, or similar
  • Solid understanding of distributed systems fundamentals and networking basics

Responsibilities

  • Get up to speed on VDC workloads, dependencies, and operational workflows
  • Write and maintain runbooks, incident guides, and operational documentation
  • Participate in incident response including triage, investigation, mitigation, and postmortems
  • Help implement and maintain SLIs, SLOs, and error budgets
  • Close monitoring gaps by implementing instrumentation, alerting, and dashboards
  • Contribute to toil reduction through automation and tooling improvements
  • Work with IaC, CI/CD pipelines, and deployment tooling in compliance-restricted environments
  • Work with engineering, security, compliance, and operations teams