Site Reliability Engineer
TobogganLabs
Healthtech AI
We prefer to hire in Quebec, but we are open to candidates anywhere in Canada within the EST ±2 time zones.
Full-Time
Middle
Salary not disclosed
Job Details
- Experience: 5+ years
- Required Skills: AWS, Azure, Grafana, Prometheus, CI/CD, Terraform, GitHub Actions, Datadog, CloudFormation
Requirements
- 5+ years of experience in infrastructure, DevOps, or site reliability engineering
- Hands-on experience with AWS or Azure infrastructure
- Experience with infrastructure-as-code tools such as Terraform or CloudFormation
- Strong experience with CI/CD tools such as GitHub Actions, Argo CD, or Jenkins
- Experience with observability tools such as Prometheus, Grafana, Datadog, or CloudWatch
- Familiarity with cloud security best practices (network security, IAM, encryption, vulnerability management)
- Excellent communication skills
- Ability to explain infrastructure and reliability trade-offs to stakeholders
Responsibilities
- Design and maintain resilient, secure cloud infrastructure using infrastructure-as-code
- Implement security controls, hardening standards, and compliance guardrails
- Design and implement monitoring, alerting, and logging systems
- Lead incident response and post-mortem processes
- Define and track SLOs and SLIs
- Automate deployment pipelines and infrastructure provisioning
- Provide technical leadership on reliability and infrastructure workstreams
- Mentor team members and contribute to internal tooling