VP of Site Reliability

Titan AI
Fintech AI
United States; Atlanta, GA strongly preferred. West Coast considered on a case-by-case basis. Remote-friendly for the right candidate.
Full-Time
VP
Salary not disclosed

Job Details

Experience
Ten or more years in engineering, with at least five years personally building SRE or platform operations functions.

Requirements

  • 10+ years in engineering.
  • 5+ years building SRE or platform operations at a company selling to enterprise/regulated markets.
  • Experience managing multi-tenant and multi-deployment-model infrastructure.
  • Proven experience building on-call rotations from scratch.
  • Experience writing and implementing SLOs.
  • Experience as technical owner during production incidents.
  • Ability to influence engineering teams through process without direct reporting lines.

Responsibilities

  • Build and personally operate the SRE practice (SLO framework, on-call rotation, incident command).
  • Lead incident response for live bank customers and produce postmortems.
  • Define severity tiers, SLA commitments, and escalation paths for production support.
  • Set the operating system across engineering lanes (sprint discipline, release rituals, code review).
  • Manage SOC 2 artifacts, model risk review documentation, and change traceability.
  • Scale deployment playbooks for a growing customer base across Azure, private cloud, and bank infrastructure.