Director, Site Reliability Engineering

Posted 7 days ago

💎 Seniority level: Director, 5+ years

📍 Location: United States, Canada

💸 Salary: 190,000 - 300,000 USD per year

🔍 Industry: Software Development

🏢 Company: Invoca | 👥 201-500 | 💰 $83,000,000 Series F, almost 3 years ago | Digital Marketing, Artificial Intelligence (AI), Advertising, Analytics, Telecommunications

🗣️ Languages: English

⏳ Experience: 5+ years

🪄 Skills: AWS, Docker, Leadership, Cloud Computing, GCP, Kafka, Kubernetes, MySQL, People Management, Grafana, Postgres, Prometheus, CI/CD, RESTful APIs, Linux, DevOps, Terraform, Microservices, Ansible, Scripting, Software Engineering, SaaS

Requirements:
  • 5+ years of hands-on experience in an SRE, DevOps, sysadmin, or infrastructure engineering role
  • Strong opinions, held with an open mind, on infrastructure design, architecture, and automation, grounded in organizational context, experience, and industry practices
  • Ability to draw on an understanding of both established systems and general industry direction to guide strategic decisions
  • Cloud computing fundamentals, particularly in AWS & GCP
  • Containerization, specifically Docker and Kubernetes via kops
  • Linux, especially Debian
  • Configuration management tooling, particularly Chef
  • Observability tooling (we use Prometheus, Grafana, Thanos, Karma, and ELK)
  • Telephony with SIP, FreeSWITCH, and Kamailio
  • Other ownership areas include Kafka, Consul, MySQL
  • 3+ years of experience directly managing SRE, DevOps, sysadmin, or other infrastructure teams
Responsibilities:
  • Directly manage an SRE Tech Lead and 8-10 direct reports across two teams
  • Build capabilities in your engineers to meet the requirements and competencies of their role
  • Organize the team around solving challenging problems presented by the team and the business
  • Draft, evolve, and communicate process, strategy, vision, and goals
  • Assist or own vendor management for infrastructure and platform tools
  • Apply a build/borrow/buy framework to technology decisions
  • Assist with compliance auditing activities for PCI, SOC, and ISO
  • Set standards and policies for infrastructure usage across the engineering org
  • Solicit feedback from internal customers on infrastructure challenges and opportunities
  • Organize and facilitate work in 2-week sprints, initiatives, epics, and stories
  • Own the team's post-incident process so that the team improves after incidents in our service area
  • Handle administrative work and facilitation for the team
  • Participate in an incident commander on-call rotation approximately two days per month