Site Reliability Engineer

TobogganLabs
Healthtech AI
We prefer to hire in Quebec, but we are open to candidates anywhere in Canada within the EST ±2 time zones.
Full-Time
Middle
Salary not disclosed

Job Details

Experience
5+ years
Required Skills
AWS, Azure, Grafana, Prometheus, CI/CD, Terraform, GitHub Actions, Datadog, CloudFormation

Requirements

  • 5+ years of experience in infrastructure, DevOps, or site reliability engineering
  • Hands-on experience with AWS or Azure infrastructure
  • Experience with infrastructure-as-code tools such as Terraform or CloudFormation
  • Strong experience with CI/CD pipelines such as GitHub Actions, ArgoCD, or Jenkins
  • Experience with observability tools such as Prometheus, Grafana, Datadog, or CloudWatch
  • Familiarity with cloud security best practices (network security, IAM, encryption, vulnerability management)
  • Excellent communication skills
  • Ability to explain infrastructure and reliability trade-offs to stakeholders

Responsibilities

  • Design and maintain resilient, secure cloud infrastructure using infrastructure-as-code
  • Implement security controls, hardening standards, and compliance guardrails
  • Design and implement monitoring, alerting, and logging systems
  • Lead incident response and post-mortem processes
  • Define and track SLOs and SLIs
  • Automate deployment pipelines and infrastructure provisioning
  • Provide technical leadership on reliability and infrastructure workstreams
  • Mentor team members and contribute to internal tooling