DevOps Team Lead, Core Foundation

Alpaca Financial Services
Remote - EMEA | Full-Time | Lead
Salary not disclosed

Job Details

Required Skills
PostgreSQL, Kubernetes, Grafana, Prometheus, Terraform

Requirements

  • Proven experience as an Engineering Manager, DevOps Lead, or Site Reliability Engineering Lead, with a track record of successfully managing globally distributed teams.
  • Exceptional people management skills, with a deep focus on coaching, mentoring, and fostering team culture across multiple time zones.
  • Deep expertise in engineering support frameworks, roadmap planning, and team prioritization methodologies.
  • Proven experience owning Change Management lifecycles.
  • Proven ability to break down organizational silos, build trust between disparate teams, and shepherd complex systemic updates from conception to deployment.
  • Extensive experience managing Incident Management lifecycles and running sustainable, global on-call rotations.
  • Excellent communication and organizational skills, with a proven ability to drive and coordinate complex, multi-stage tech rollouts and deployments.
  • Solid technical background in modern DevOps/SRE ecosystems.
  • Understanding of Kubernetes (GKE).
  • Understanding of Infrastructure as Code (Terraform).
  • Understanding of Relational Databases (PostgreSQL).
  • Understanding of Observability stacks (Prometheus, Grafana, Thanos).
  • Strategic mindset capable of navigating shifting priorities.

Responsibilities

  • Lead, mentor, and foster a healthy, high-performing globally distributed engineering team.
  • Own the execution and delivery of critical, complex annual roadmap items centered on large-scale foundational infrastructure upgrades, high availability, and platform resilience.
  • Own and drive the change management processes across engineering and product domains, orchestrating smooth delivery of major systemic changes.
  • Design, implement, and refine robust support workflows, agile planning methodologies, and deployment/rollout strategies to ensure operational excellence.
  • Manage and optimize the global on-call rotation to ensure team well-being while maintaining high availability.
  • Lead incident response (via Rootly), establishing clear communication, rapid resolution processes, and blameless post-mortems.