Senior Site Reliability Engineer - Payward Services
Kraken
Crypto
United States, EU timezones
Full-Time
Senior
Salary not disclosed
Job Details
- Experience: 5+ years
- Required Skills: Docker, Python, Bash, Git, Kubernetes, Go, Grafana, Prometheus, CI/CD, Linux
Requirements
- 5+ years in a DevOps or SRE role
- Proficiency with hybrid-cloud infrastructure environments
- Proficiency with Git version control and CI/CD configuration
- Deep understanding of monitoring and alerting systems, preferably Prometheus and Grafana
- Ability to debug complex distributed systems, networks, and Linux operating system issues
- Containerization and orchestration experience (Docker, Nomad; Kubernetes a plus)
- Strong scripting skills (Bash, Python, or Go)
- Self-starter capable of thriving independently and remotely in fast-paced environments
Responsibilities
- Manage and support infrastructure for Payward Services, including Nomad, Kubernetes, databases, and third-party system integrations
- Provide operational support across multiple teams, helping debug issues in staging and production environments
- Participate in incident response and post-incident reviews to improve system resilience
- Consult with teams on performance, monitoring, and alerting best practices, with awareness of partner-facing SLA commitments
- Build tooling, automation, and dashboards to improve observability and empower development teams
- Maintain and troubleshoot CI pipelines, ensuring reliable and fast build, test, and deployment cycles
- Collaborate with developers, QA, and product managers to streamline development and release cycles
- Support a fully distributed team operating across multiple timezones