- Define and maintain SLOs, SLIs, and error budgets, along with the observability stack needed to catch regressions.
- Build repeatable, self-service infrastructure using infrastructure-as-code and CI/CD pipelines.
- Own end-to-end rollouts including progressive delivery, canary deployments, and safe migrations.
- Operate the infrastructure for nodes, validators, RPC, and indexing services, optimizing for performance and cost.
- Lead incident response and on-call rotations, and facilitate blameless postmortems.
- Partner with product and protocol teams to design and operate production-ready services.
Docker, Kubernetes, TypeScript, +5 more