Principal Customer Reliability Engineer
New
A
AccelaGovernment Software
Remote Based - USFull-TimePrincipal
Salary$160,000-$190,000
Apply NowOpens the employer's application page
Job Details
- Experience
- 8+ years
- Required Skills
- PythonBashGitKubernetesMicrosoft AzureTerraform
Requirements
- 8+ years of experience in Production Engineering, Site Reliability Engineering, Cloud Operations, Technical Support Engineering, or related SaaS environments.
- Strong customer focus and demonstrated ability to communicate effectively with both technical and business stakeholders.
- Hands-on experience operating and supporting SaaS platforms on Microsoft Azure.
- Experience with Kubernetes and modern containerized environments.
- Strong experience using observability and monitoring tools, including APM platforms, distributed tracing, logging, and metrics solutions.
- Deep troubleshooting and Root Cause Analysis expertise across application, infrastructure, networking, operating system, and database layers.
- Working knowledge of Infrastructure-as-Code concepts and tools, particularly Terraform.
- Experience developing automation and operational tooling using Python, PowerShell, Bash, or similar scripting languages.
- Demonstrated ability to lead Incident, Problem, and Change Management processes during high-severity customer escalations.
- Excellent written and verbal communication skills, including experience presenting technical information to customer leadership and executive stakeholders.
- Experience using Git and GitHub-based workflows.
Responsibilities
- Serve as the customer-facing technical representative for Accela's SaaS Operations organization, partnering with Engineering, SRE, Database Engineering, Product, Professional Services, and Support teams to ensure customer success.
- Lead technical engagements related to SaaS implementations, migrations, and ongoing production operations, ensuring reliable and predictable outcomes for customers.
- Develop and improve operational processes, tooling, monitoring, metrics, and alerting capabilities that support customer onboarding, migrations, and ongoing platform reliability.
- Act as a senior escalation point for complex customer issues, leveraging observability tools, application performance monitoring, log analysis, distributed tracing, and metrics to diagnose and resolve production concerns.
- Lead cross-functional response efforts for critical customer-impacting incidents and implementation challenges, coordinating stakeholders to drive timely resolution.
- Partner with customers to establish reliability expectations, communicate service-level commitments, and provide guidance regarding operational risk and change management practices.
- Collaborate with Sales, Professional Services, and Support teams to communicate Accela's service management processes, cloud operations practices, and compliance posture.
- Provide technical leadership, mentorship, and best-practice guidance to Customer Reliability Engineers, Site Reliability Engineers, and other technical teams.
View Full Description & ApplyYou'll be redirected to the employer's site