Site Reliability Engineer

Posted 4 days ago
71,800 - 190,000 USD per year
United States, Full-Time, Cloud Contact Center Software
Company: Five9
Location: United States
Languages: English
Seniority level: Senior, 3+ years
Experience: 3+ years
Skills:
AWS, Docker, Python, Software Development, SQL, GCP, Git, Java, Kubernetes, Azure, Grafana, Prometheus, NoSQL, CI/CD, Linux, DevOps, Terraform, Microservices, Ansible
Requirements:
3+ years managing large-scale production environments.
Comfortable with 24/7 on-call responsibilities and incident response.
Strong Linux/Unix system administration skills.
Understanding of TCP/IP, DNS, load balancing, and network security.
Experience with SQL and NoSQL databases in production environments.
Proficiency in at least two of: Python, Shell, PHP, Java, or similar languages.
Experience with AWS, GCP, or Azure infrastructure and services.
Hands-on experience with Docker, Kubernetes, and container orchestration.
Experience with Prometheus, Grafana, ELK stack, or similar monitoring tools.
Proficiency with Terraform, CloudFormation, or similar infrastructure-as-code tools.
Expert-level Git usage and collaborative development practices.
Experience defining and maintaining service level objectives.
Understanding of error budget concepts and implementation.
Track record of identifying and eliminating repetitive manual work.
Experience with performance testing and capacity management.
Experience with microservices architecture and distributed systems.
Knowledge of security best practices and compliance frameworks.
Responsibilities:
Design and implement dashboards for OS/platform and application monitoring.
Establish and maintain SLIs, SLOs, and error budgets.
Build alerting systems and performance monitoring.
Participate in on-call rotations and lead incident response.
Maintain CI/CD pipelines.
Develop and maintain infrastructure using tools like Terraform.
Automate system configuration.
Ensure security scanning systems are in place.
Maintain access control, authentication, and audit logging.
Monitor and optimize cloud resource usage and costs.
Analyze usage patterns and plan for future capacity needs.
Build and maintain common services like notification systems and caching layers.
Manage database reliability, performance, and scaling.
Implement and maintain service discovery, load balancing, and network policies.
Create and maintain tools for developer productivity and system reliability.