Senior Site Reliability Engineer - Platform

Posted 2024-11-07

💎 Seniority level: Senior, 5+ years

📍 Location: USA

🔍 Industry: Cryptocurrency

🏢 Company: Referrals Only Board

🗣️ Languages: English

⏳ Experience: 5+ years

🪄 Skills: Docker, Python, Blockchain, Ethereum, JavaScript, Kubernetes, Ruby, Algorithms, Data Structures, Golang, Communication Skills, Linux, Terraform

Requirements:
  • At least 5 years of software engineering experience.
  • Strong understanding of data structures and algorithms related to performance and reliability.
  • Fluency in at least one programming language such as Golang, Ruby, Python, or JavaScript.
  • Strong skills around observability, debugging, and performance tuning.
  • Ability to debug complex systems and willingness to understand and improve any layer of the stack.
  • Experience with containers and container orchestration (Docker, ECS, EKS) and with monitoring tools (Datadog, Graphite, Grafana, Prometheus).
  • Deep knowledge of UNIX/Linux system internals including system calls, TCP/IP, and debugging tools.
  • Strong communication skills and ability to explain technical concepts clearly.
  • Demonstrated critical thinking under pressure.

Responsibilities:
  • Build automation and improve systems to eliminate toil and operations work.
  • Improve observability, reliability, and availability by defining and measuring key metrics.
  • Collaborate with the core infrastructure team to tune the performance of cloud deployments and optimize them.
  • Collaborate with product teams to reduce service disruptions and automate incident response.
  • Proactively find and analyze reliability problems and design software for improvements.
  • Facilitate incident response, conduct root cause analysis, and run blameless retrospectives.
  • Educate and mentor the engineering team to enhance system reliability and promote reliability as a core value.