Site Reliability Engineer

Posted 5 days ago

Poland, Full-Time, Software Development

Location: Poland

Languages: English

Seniority level: Senior, 2+ years

Experience: 2+ years

Skills:

AWS, Node.js, PostgreSQL, Python, Software Development, Bash, GCP, Kubernetes, MongoDB, MySQL, Azure, Grafana, Linux, DevOps, Terraform, Ansible

Requirements:

2+ years as a Site Reliability Engineer or Software Developer
Advanced experience with programming/scripting languages such as JavaScript/Node.js, Python, or Bash
Knowledge of Linux monitoring, troubleshooting, and administration
Competence in at least one programming language
Ability to write and evaluate code for scalability and runtime
Experience with container orchestration platforms such as Kubernetes or Nomad
Experience with monitoring, APM, and logging tooling (e.g. ELK, Grafana, Datadog, New Relic, or Splunk)
Experience working with at least one DBMS (e.g. Postgres, MySQL, Oracle, or MongoDB)
Experience with configuration management tools (e.g. Ansible, Puppet, Chef, or Salt); Ansible Tower or AWX is a plus
Experience with Infrastructure-as-Code tools such as Terraform, CloudFormation, Google Deployment Manager, or Azure Resource Manager
Experience working with at least one major cloud provider (AWS/Azure/GCP)
Understanding of cloud-native security requirements (e.g. WAF, security groups)

Responsibilities:

Combine software and systems engineering to build and run large-scale, distributed, fault-tolerant systems
Solve operational problems with a software engineering mindset, treating operations as a software problem and automating away toil
Define, measure, and monitor Service Level Objectives (SLOs) and Service Level Indicators (SLIs) in partnership with product and engineering teams
Champion and help manage error budgets to guide the balance between reliability work and new feature velocity
Participate as a partner in planning the product roadmap, sprint planning, and stand-ups
Scale systems sustainably through automation; evolve systems by pushing for changes that improve reliability and velocity
Increase visibility into the health and durability of our platform
Practice sustainable incident response and blameless postmortems
Maintain existing services and tools, augmenting and replacing
