Site Reliability Engineer
Paddle
Payment Infrastructure
UK, Portugal, Ireland
Full-Time
Salary not disclosed
Job Details
- Required Skills: AWS, Docker, PHP, PostgreSQL, MySQL, Go, Linux, Terraform
Requirements
- Software development background with experience shipping and operating production services.
- Strong fundamentals in testing, code review, CI/CD, and debugging.
- Professional experience working across the AWS ecosystem.
- Knowledge of platform and operations concepts, specifically networking and Linux administration.
- Experience with microservices and distributed systems at scale.
- Familiarity with monitoring and observability tools (e.g., OpenTelemetry, Honeycomb, Grafana).
- Curiosity about AI and its role in software development.
- Collaborative, security-minded, and detail-oriented approach.
- Ability to thrive in a fast-paced environment.
Responsibilities
- Develop and maintain tools to maximize engineering efficiency, such as automating deployment infrastructure and database upgrades.
- Collaborate with internal teams to improve processes with automation, prioritizing Developer Experience.
- Create, maintain, and test system disaster recovery processes, including automated tooling.
- Handle production incidents, author blameless postmortems, and maintain operational runbooks.
- Manage monitoring, alerting, and SLO tracking.
- Perform load testing and bottleneck analysis to drive performance tuning across applications and AWS services.
- Execute cost optimization initiatives including right-sizing, autoscaling, and storage tiering.
- Advocate for and implement GitOps methodologies.