Senior Network Site Reliability Engineer

Spain; flexible remote work options across Europe | Full-Time | Senior
Salary not disclosed

Job Details

Required Skills
Python, Go, CI/CD, Linux

Requirements

  • Strong experience in Site Reliability Engineering, Network Engineering, or Infrastructure Engineering roles within large-scale production environments.
  • Solid Linux systems administration expertise and proven ability to troubleshoot complex distributed systems.
  • Strong understanding of networking fundamentals, including failure domains, latency, packet loss, control plane/data plane concepts, and high-availability architectures.
  • Hands-on experience operating and improving reliable production systems through automation and engineering best practices.
  • Proficiency in software development or scripting using Go, Python, or similar programming languages.
  • Experience with infrastructure-as-code, CI/CD pipelines, containerized environments, and operational automation tools.
  • Familiarity with observability, telemetry, monitoring systems, and incident management practices.
  • Ability to work collaboratively across engineering teams while maintaining strong ownership and communication skills.

Responsibilities

  • Define and manage reliability objectives for critical network services, including SLIs, SLOs, availability targets, and operational performance standards.
  • Lead initiatives to improve overall network reliability across infrastructure, inter-site connectivity, and operational workflows.
  • Own incident response processes for networking environments, conduct root cause investigations, and implement long-term corrective solutions.
  • Design and enhance observability systems through metrics, logging, tracing, alerting, and monitoring improvements to accelerate troubleshooting and recovery.
  • Build and maintain automation, CI/CD pipelines, testing environments, rollback mechanisms, and safe deployment processes for network changes.
  • Collaborate with platform engineering and infrastructure teams to improve operability, scalability, and reliability of networking systems.
  • Develop tooling and automation solutions using modern programming languages and infrastructure management practices.
  • Support operational readiness and scalability initiatives for high-availability and high-throughput networking environments.