Site Reliability Engineer

Working remote from BC, Pacific time zone
Full-Time
Middle
Salary
80,000 - 100,000 CAD per year

Job Details

Experience
3+ years
Required Skills
Docker, Python, Bash, Git, Kubernetes, Prometheus, Linux, Terraform, Ansible

Requirements

  • 3+ years of software and/or operational experience in building and maintaining internet-facing production environments.
  • Strong experience with Linux/Unix systems administration.
  • Strong scripting abilities in Bash and Python.
  • Knowledge of source control tools (Git preferred).
  • Experience with Configuration Management and Infrastructure as Code tools (Ansible, Puppet, Terraform preferred).
  • Good understanding of container technology (Docker, Kubernetes preferred).
  • Experience with monitoring tools (Prometheus, Grafana, Nagios, or similar) and alerting systems.
  • Experience running a large-scale 24/7 production environment.
  • Experience with incident management, troubleshooting, and root cause analysis.
  • Bachelor's degree in information systems, computer science, technology, or a related field preferred; 2+ years of relevant experience accepted in lieu of a degree.

Responsibilities

  • Ensure the reliability of critical products and services by meeting or exceeding SRE objectives.
  • Instantiate and maintain production infrastructure using Infrastructure as Code and Configuration Management tools.
  • Build and maintain service monitoring using centralized logging and time series databases.
  • Automate deployments, administration, and monitoring by following CI/CD practices.
  • Work with engineering and information security teams to improve the operability and security of services.
  • Participate in team on-call rotation.