Software Architect - Reliability Engineering
This role is remote and based on the East Coast of the USA, or remote in Ireland, the UK, or Spain.
Full-Time, Senior
Salary: $228k-$335k per year
Job Details
- Experience: 15+ years
- Required Skills: AWS, Python, Kubernetes, Go, DevOps, Terraform, Software Engineering, Distributed Systems
Requirements
- 15+ years of experience in Reliability Engineering, Software Engineering, or DevOps.
- Strong experience in driving strategic technical decisions and defining long-term technical vision.
- In-depth understanding of Reliability Engineering in a large SaaS organization.
- Experience driving cross-org technical architecture outcomes.
- Knowledge of cloud architecture, DevOps practices, and microservices design.
- Bachelor's or Master's degree in Computer Science or Engineering, or equivalent experience.
- Strong production experience with scaling and tuning performance in high-scale environments.
- Hands-on experience with Kubernetes (e.g., EKS) and AWS.
- Proficiency in infrastructure-as-code tools such as Terraform or CloudFormation.
- Expertise in observability tools (e.g., Prometheus, Grafana, Datadog).
- Proficiency in at least one programming language (e.g., Go, Python, Java).
- Experience with incident response, SLOs/SLIs, and on-call rotations.
Responsibilities
- Partner with senior technical leaders to set and communicate the reliability strategy.
- Influence company-wide architectural decisions, balancing long-term vision with near-term needs.
- Lead the design, implementation, and operation of scalable solutions for high-traffic services.
- Design fault-tolerant architectures and the processes for incident response, disaster recovery, and capacity management.
- Collaborate with cross-functional teams to identify and address reliability risks.
- Mentor and grow engineers and technical leaders.
- Track and apply emerging SRE and cloud best practices to drive systemic improvements.