Senior Site Reliability Engineer

Posted 7 months ago (inactive)
190,000 - 220,000 USD per year
United States, Canada | Full-Time | Software Development
Company:
Location: United States, Canada
Languages: English
Seniority level: Senior, 5+ years
Experience: 5+ years
Skills:
AWS, Docker, Node.js, PostgreSQL, Python, Bash, Kafka, MongoDB, React Native, TypeScript, Vue.js, Nest.js, React, CI/CD, Linux, DevOps, Terraform, JSON
Requirements:
5+ years running production workloads on AWS (or GCP/Azure) with infrastructure-as-code (Terraform/CDK/CloudFormation)
Hands-on experience operating container orchestration (ECS, EKS, Kubernetes, Nomad, etc.) and designing blue/green or canary rollouts
Depth in at least two of our core datastores (Postgres, MongoDB, Kafka), including backup/restore, upgrades, and performance tuning
Fluency with CI/CD pipelines (we use Buildkite + GitHub Actions) and a knack for automating everything with shell, Python, or TypeScript
Proven track record setting up monitoring/alerting in Datadog, Prometheus, or similar, with clear SLO/SLA ownership
Strong grasp of Linux networking, load balancing (Cloudflare/ELB), and CDN/edge-security concepts
Excellent incident-management and root-cause analysis skills; able to write crisp RCAs and follow through on action items
Passion for customer-centric thinking, rapid iteration, and continuous learning
Responsibilities:
Set SLOs/SLIs, build self-healing architectures, and drive incident-prevention projects that keep our APIs and real-time ordering flows under 100 ms at p95.
Level up dashboards, alerts, and distributed tracing so teams can detect issues before customers do.
Evolve our Buildkite pipelines and Terraform modules to give engineers one-click rollouts that finish in under 10 minutes (and clean rollbacks).
Harden infra with least-privilege IAM, threat-model topology changes, and guide SOC 2 / PCI efforts.
Tune Postgres for multi-TB workloads, maintain Mongo sharding, and shepherd Kafka topic management as event volume climbs.
Rotate with the on-call SREs, run blameless post-mortems, and convert findings into durable fixes.
Pair with product engineers on capacity reviews, guide junior devs on Docker best practices, and evangelize “you build it, you run it.”