Sr./Staff Infrastructure/Site Reliability Engineer
Oscilar | AI, Fintech
Remote, Remote - Canada | Full-Time | Senior
Salary not disclosed
Job Details
- Required Skills: AWS, Python, Kafka, Kubernetes, ClickHouse, Go, CI/CD, Terraform, Microservices
Requirements
- Proven track record as a senior SRE or Infrastructure Engineer in high-scale environments
- Expert-level skills in AWS
- Expert-level skills in Infrastructure as Code (Pulumi, Terraform)
- Strong programming ability in Go or Python
- Deep understanding of distributed systems (Kafka, ClickHouse) and microservices architecture
- Mastery of container orchestration (Kubernetes) and production debugging
- Strong sense of ownership and the judgment to balance velocity with reliability
Responsibilities
- Architect and operate resilient cloud infrastructure (AWS, Pulumi, Kubernetes)
- Lead initiatives to improve availability, latency, and performance at scale
- Design and evolve our CI/CD pipelines to optimize for speed, safety, and repeatability
- Define the metrics, alerts, and runbooks that form our observability backbone
- Run chaos experiments and failure simulations to harden the platform
- Mentor engineers and set best practices for SRE across the company