Senior Site Reliability Engineer

US States: Arizona, California, Colorado, Connecticut, District of Columbia*, Florida, Georgia, Idaho, Illinois, Indiana, Iowa, Maryland, Massachusetts, Michigan, Minnesota, Missouri, New Jersey, New Mexico, New York, North Carolina, Ohio, Oklahoma, Oregon, Pennsylvania, Puerto Rico*, Rhode Island, Tennessee, Texas, Utah, Vermont, Virginia, Washington, West Virginia, Wisconsin and Wyoming.
Countries: Brazil, Canada, Colombia, Germany, Ghana, India, Indonesia, Italy, Kenya*, Mexico, Morocco, Netherlands, Poland, Singapore*, South Africa, Spain, Switzerland and the United Kingdom.
Ability to work across multiple time zones.
Full-Time
Senior

Salary: 113,082 - 175,725 USD per year

Job Details

Languages: Strong English language skills (verbal and written).
Experience: 6+ years of experience in an SRE/Operations/DevOps role as part of a team
Required Skills: Python, Kubernetes, Linux, DevOps, Distributed Systems

Requirements

6+ years of experience in an SRE, Operations, or DevOps role.
Proficiency with shell and a scripting language (Python, Go, Bash, or Ruby).
Experience with configuration management tools such as Puppet or Ansible.
Experience with distributed caching systems and performance optimization.
Experience with package management on Linux systems (specifically Debian).
Strong Linux system-level troubleshooting skills.
Proven track record of automating tasks, identifying process gaps, and implementing improvements.
Strong English language skills (verbal and written).
Ability to work independently in a globally distributed team across multiple time zones.
Experience leading incident response and post-incident review rituals for root cause analysis.
Willingness to travel 1-2 times per year for in-person events and team meetings.

Responsibilities

Perform day-to-day operational/DevOps tasks on public-facing infrastructure, including deployment, maintenance, configuration, and troubleshooting.
Implement and utilize configuration management and deployment tools such as Puppet and Kubernetes.
Lead continuous improvement initiatives by automating the installation, configuration, and maintenance of services.
Assist product teams with architectural design to ensure new services operate at scale.
Participate in a 24/7 on-call rotation for incident response, diagnosis, and follow-up on system outages.
Collaborate with a global, cross-functional team in an asynchronous environment.
Mentor peers in technical and operational areas.
