📍 Germany, Spain
- Love learning new technologies and thrive on solving complex challenges.
- Ability to work effectively across teams.
- Strong understanding of software development principles, systems architecture, cloud infrastructure, and monitoring technologies.
- Knowledge of the fundamentals of networking, databases, system administration, version control, and CI/CD automation.
- Proficiency in one or more programming languages (preferably Java/Kotlin, Go, Python).
- Proven experience in leading and mentoring engineering teams.
- Strong analytical skills and the ability to troubleshoot complex systems, with excellent verbal and written communication skills.
- Develop and execute a strategic roadmap for the SRE team.
- Ensure proper team focus on priorities, milestones, and deliverables.
- Implement and drive SRE best practices and lead your team in daily Agile and DevOps practices.
- Manage the on-call team and the incident process, including post-mortem follow-up.
- Identify areas for improvement and propose solutions that align with business goals.
- Identify and assess risks to production systems and work to mitigate them.
- Optimize resource allocation and usage for operational and cost efficiency.
- Lead, mentor, and develop a team of SREs to ensure the reliability, scalability, and performance of our production systems.
- Promote continuous learning and knowledge sharing.
- Work closely with development, product, and other engineering teams to ensure reliability is prioritized in the development lifecycle.
- Communicate effectively with stakeholders regarding reliability metrics and measures, post-mortems, and incident action items.
AWS, Docker, Python, Software Development, Agile, GCP, Java, Kotlin, Kubernetes, Go, Grafana, Prometheus, Communication Skills, Analytical Skills, CI/CD, Mentoring, Linux, DevOps, Written Communication
Posted about 1 month ago