Senior Site Reliability Engineer, CCIP
Chainlink Labs
Blockchain Infrastructure
Locations: Charlotte, Brazil, Canada, Phoenix, Argentina, Las Vegas, Tampa, Colombia, Mexico, Spain
Overlap some working hours with Eastern Standard Time (EST).
Full-Time, Senior
Salary: 129,000 - 244,000 USD per year
Job Details
- Required Skills: Kubernetes, Distributed Systems
Requirements
- Demonstrated experience in Site Reliability Engineering, Production Engineering, or a similar role operating large-scale distributed systems.
- Deep expertise defining, implementing, and driving adoption of SLOs, SLIs, and error budgets across engineering organizations.
- Experience building and operating production Kubernetes environments supporting critical services.
- Experience applying OpenTelemetry to improve observability across distributed systems.
- Experience improving the reliability, scalability, and operability of production infrastructure.
- Demonstrated technical leadership influencing reliability practices across engineering teams (preferred).
- Experience performing capacity planning and performance tuning for high-throughput distributed services (preferred).
- Previous experience working on Web3 infrastructure or within a crypto-native engineering organization (preferred).
- Experience applying chaos engineering or fault-injection techniques to improve production resilience (preferred).
- Experience partnering with software engineering teams to conduct production-readiness reviews before service launches (preferred).
- Experience leading on-call operations, including defining rotations, escalation policies, and improving alert quality (preferred).
Responsibilities
- Improve deployment safety and increase delivery velocity by advancing production engineering practices.
- Establish distributed tracing across the platform to improve observability and accelerate incident investigation.
- Eliminate operational toil through automation that increases engineering efficiency and platform reliability.
- Drive adoption of meaningful SLOs, SLIs, and error budgets that guide engineering decisions and improve service health.
- Increase platform scalability and operational readiness as CCIP continues to grow.
- Strengthen Chainlink's reputation through highly available production systems while reducing operational overhead.