Senior Site Reliability Engineer

United States | Full-Time | Senior

Salary not disclosed

Job Details

Experience: 8+ years of hands-on technical experience in software engineering, infrastructure, or operations roles, including a minimum of 4 years dedicated to Site Reliability Engineering (SRE).
Required Skills: AWS, Python, Bash, Kubernetes, CI/CD

Requirements

8+ years of hands-on technical experience in software engineering, infrastructure, or operations roles.
Minimum of 4 years dedicated to Site Reliability Engineering (SRE).
Strong proficiency in Python, Bash, and PowerShell.
Expert-level experience designing, building, and maintaining autonomous systems.
Proficient hands-on experience with AWS (e.g., EC2, Kubernetes/EKS, CloudWatch, Lambda, S3, IAM).
Proficiency in monitoring/alerting, incident response, capacity planning, and performance optimization.
Bachelor’s degree in Computer Science, Information Systems, or a related field; or equivalent certifications/experience.
Proven track record of independently driving reliability improvements.

Responsibilities

Provide strong leadership, mentoring, and sound judgment as the Reliability Engineering lead on your team.
Design and maintain autonomous systems for building, deploying, testing, and operating all Filevine products.
Act as the authoritative voice of reliability across the full software development lifecycle (SDLC).
Monitor, aggregate, dashboard, and alert on software/infrastructure events to ensure visibility and fast response.
Continuously enhance CI/CD pipelines, automation scripts, playbooks, and tools to streamline processes.
Proactively identify and resolve gaps in system availability, performance, and security.
Document processes, architecture, procedures, and best practices.
Collaborate within your team, mentor junior engineers, and participate in a 24/7 on-call rotation.