Principal Customer Reliability Engineer

New
A
AccelaGovernment Software
Remote Based - USFull-TimePrincipal
Salary$160,000-$190,000
Apply NowOpens the employer's application page

Job Details

Experience
8+ years
Required Skills
PythonBashGitKubernetesMicrosoft AzureTerraform

Requirements

  • 8+ years of experience in Production Engineering, Site Reliability Engineering, Cloud Operations, Technical Support Engineering, or related SaaS environments.
  • Strong customer focus and demonstrated ability to communicate effectively with both technical and business stakeholders.
  • Hands-on experience operating and supporting SaaS platforms on Microsoft Azure.
  • Experience with Kubernetes and modern containerized environments.
  • Strong experience using observability and monitoring tools, including APM platforms, distributed tracing, logging, and metrics solutions.
  • Deep troubleshooting and Root Cause Analysis expertise across application, infrastructure, networking, operating system, and database layers.
  • Working knowledge of Infrastructure-as-Code concepts and tools, particularly Terraform.
  • Experience developing automation and operational tooling using Python, PowerShell, Bash, or similar scripting languages.
  • Demonstrated ability to lead Incident, Problem, and Change Management processes during high-severity customer escalations.
  • Excellent written and verbal communication skills, including experience presenting technical information to customer leadership and executive stakeholders.
  • Experience using Git and GitHub-based workflows.

Responsibilities

  • Serve as the customer-facing technical representative for Accela's SaaS Operations organization, partnering with Engineering, SRE, Database Engineering, Product, Professional Services, and Support teams to ensure customer success.
  • Lead technical engagements related to SaaS implementations, migrations, and ongoing production operations, ensuring reliable and predictable outcomes for customers.
  • Develop and improve operational processes, tooling, monitoring, metrics, and alerting capabilities that support customer onboarding, migrations, and ongoing platform reliability.
  • Act as a senior escalation point for complex customer issues, leveraging observability tools, application performance monitoring, log analysis, distributed tracing, and metrics to diagnose and resolve production concerns.
  • Lead cross-functional response efforts for critical customer-impacting incidents and implementation challenges, coordinating stakeholders to drive timely resolution.
  • Partner with customers to establish reliability expectations, communicate service-level commitments, and provide guidance regarding operational risk and change management practices.
  • Collaborate with Sales, Professional Services, and Support teams to communicate Accela's service management processes, cloud operations practices, and compliance posture.
  • Provide technical leadership, mentorship, and best-practice guidance to Customer Reliability Engineers, Site Reliability Engineers, and other technical teams.
View Full Description & ApplyYou'll be redirected to the employer's site
$160,000-$190,000
Apply Now