Senior Site Reliability Engineer

Pragmatike
Cloud Computing
Armenia; Latvia, Spain, Albania, Bosnia & Herzegovina, Estonia, Poland, Portugal, Italy, Romania (CET ±2h)
Full-Time
Senior
Salary not disclosed

Job Details

Languages
Fluent English
Required Skills
Python, Bash, Kubernetes, Grafana, Prometheus, Linux, Networking, Ansible

Requirements

  • Expert-level, hands-on experience operating Kubernetes in production environments.
  • Strong network engineering skills (VLANs, L2/L3 routing, VPNs, multi-site connectivity).
  • Strong proficiency with Linux systems administration (Debian/Ubuntu).
  • Experience building and maintaining automation workflows (Ansible, Bash/Python, Git-based workflows).
  • Experience with observability stacks such as Prometheus, Grafana, ELK, Loki, or Graylog.
  • Background in virtualization technologies (OpenStack, Proxmox, VMware).
  • Experience with bare-metal provisioning and MAAS (Metal as a Service).
  • Strong understanding of distributed systems and container orchestration.
  • Process-oriented mindset with the ability to develop SOPs and operational procedures from scratch.
  • Experience with incident response, escalation procedures, and on-call rotations.
  • Ability to work autonomously in a fast-paced, engineering-driven environment.

Responsibilities

  • Operate and maintain Linux-based infrastructure (Debian/Ubuntu).
  • Deploy, manage, and scale Kubernetes clusters across bare-metal, virtualized, and on-prem environments.
  • Implement automation for provisioning and operations using Ansible, Bash/Python, and GitOps workflows.
  • Design and maintain networking architecture including VLANs, L2/L3 routing, VPNs, and multi-site connectivity.
  • Deploy and maintain observability stacks (Prometheus/Grafana, Loki, ELK, Graylog).
  • Lead incident response and escalation activities across the platform.
  • Define and implement SLOs/SLIs at the physical network, platform virtualization, and software service levels.
  • Develop Standard Operating Procedures (SOPs) for repeatable operations and maintenance tasks.