Senior Site Reliability Engineer

US States: Arizona, California, Colorado, Connecticut, District of Columbia*, Florida, Georgia, Idaho, Illinois, Indiana, Iowa, Maryland, Massachusetts, Michigan, Minnesota, Missouri, New Jersey, New Mexico, New York, North Carolina, Ohio, Oklahoma, Oregon, Pennsylvania, Puerto Rico*, Rhode Island, Tennessee, Texas, Utah, Vermont, Virginia, Washington, West Virginia, Wisconsin and Wyoming.
Countries: Brazil, Canada, Colombia, Germany, Ghana, India, Indonesia, Italy, Kenya*, Mexico, Morocco, Netherlands, Poland, Singapore*, South Africa, Spain, Switzerland and the United Kingdom.
Ability to work across multiple time zones.
Full-Time
Senior
Salary: 113,082 - 175,725 USD per year

Job Details

Languages
Strong English language skills (verbal and written).
Experience
6+ years of experience in an SRE/Operations/DevOps role as part of a team
Required Skills
Python, Kubernetes, Linux, DevOps, Distributed Systems

Requirements

  • 6+ years of experience in an SRE, Operations, or DevOps role.
  • Proficiency with shell and a scripting language (Python, Go, Bash, or Ruby).
  • Experience with configuration management tools such as Puppet or Ansible.
  • Experience with distributed caching systems and performance optimization.
  • Experience with package management on Linux systems (specifically Debian).
  • Strong Linux system-level troubleshooting skills.
  • Proven track record of automating tasks, identifying process gaps, and implementing improvements.
  • Strong English language skills (verbal and written).
  • Ability to work independently in a globally distributed team across multiple time zones.
  • Experience leading incident response and post-incident review rituals for root cause analysis.
  • Willingness to travel 1-2 times per year for in-person events and team meetings.

Responsibilities

  • Perform day-to-day operational/DevOps tasks on public-facing infrastructure, including deployment, maintenance, configuration, and troubleshooting.
  • Implement and utilize configuration management and deployment tools such as Puppet and Kubernetes.
  • Lead continuous improvement initiatives by automating the installation, configuration, and maintenance of services.
  • Assist product teams with architectural design to ensure new services operate at scale.
  • Participate in a 24/7 on-call rotation for incident response, diagnosis, and follow-up on system outages.
  • Collaborate with a global, cross-functional team in an asynchronous environment.
  • Mentor peers in technical and operational areas.