Site Reliability Engineer

Posted 3 months ago

Canada, Full-Time, Supply Chain Solutions

Company: Tecsys Inc.

Location: Canada

Languages: English

Seniority level: Staff, 5+ years

Experience: 5+ years

Skills:

AWS, Python, AWS EKS, Bash, Cloud Computing, Java, Jenkins, Kubernetes, CI/CD, DevOps, Terraform, Ansible, SaaS

Requirements:

5+ years in Site Reliability, Cloud, or DevOps Engineering
Experience designing and deploying large-scale systems, multi-vendor platforms, and globally distributed infrastructure
Proven experience managing cloud infrastructure in AWS (multi-account, VPC, EC2, EKS) and Kubernetes at scale
Strong hands-on experience with IaC and automation (Terraform, Ansible, or similar)
Familiarity with CI/CD pipelines and release automation (GitLab preferred, Jenkins acceptable)
Deep understanding of monitoring and observability using Datadog (or equivalent)
Experience with incident management, on-call participation, escalation, and structured postmortems
Scripting skills in Python, Bash, Java, or equivalent
Basic knowledge of Java- or .NET-based development
Strong English communication skills

Responsibilities:

Collaborate with Engineering teams on system design, platforms, capacity planning, and launch reviews.
Innovate to simplify, scale, and strengthen the platform.
Maintain services by monitoring availability, latency, and system health.
Own observability by enhancing monitoring and alerting using Datadog.
Drive automation for tooling, IaC, and pipelines.
Scale systems sustainably and evolve them for reliability and velocity.
Participate in the on-call rotation.
Practice sustainable incident response and blameless postmortems.
Lead post-incident reviews (RCAs) and identify long-term fixes.
Implement monitoring, logging, alerting, and SLA reporting.
Create and maintain technical documentation.
Implement and mature SRE best practices.
Act as Incident Commander for incidents.
Provide support for planning and deployment teams.
Collaborate with the Platform Engineering team on strategic efforts.
Work cross-functionally with internal teams and vendors.