Junior Site Reliability Engineer

Must be located in the United States | Full-Time | Junior
Salary not disclosed

Job Details

Experience
2+ years of experience in 24x7x365 production operations; 2+ years of experience installing, managing, and troubleshooting Linux and/or Windows Server operating systems in a production environment; 2+ years of experience supporting cloud operations and automation in AWS, Azure, or GCP; 2+ years of experience with Infrastructure-as-Code and orchestration/automation tools such as Terraform and Ansible
Required Skills
AWS, Python, Bash, GCP, Jira, Azure, Linux, Terraform, Networking, Ansible, ServiceNow

Requirements

  • BS or higher in a related Information Technology field, or an equivalent combination of education and experience
  • 2+ years of experience in 24x7x365 production operations
  • Fundamental understanding of networking and networking troubleshooting
  • 2+ years of experience installing, managing, and troubleshooting Linux and/or Windows Server operating systems in a production environment
  • 2+ years of experience supporting cloud operations and automation in AWS, Azure, or GCP (and aligned certifications)
  • 2+ years of experience with Infrastructure-as-Code and orchestration/automation tools such as Terraform and Ansible
  • Experience with IaaS platform capabilities and services (cloud certifications expected)
  • Experience with ticketing tools such as Jira and ServiceNow
  • Experience using environmental analytics tools such as Splunk and Elastic Stack for querying, monitoring and alerting
  • Experience in at least one primary scripting language (Bash, Python, PowerShell)
  • Excellent communication, organizational, and problem-solving skills in a dynamic environment
  • Effective documentation skills, including technical diagrams and written descriptions
  • Ability to work as part of a team with a professional attitude and demeanor

Responsibilities

  • Become a member of a highly collaborative engineering team offering a unique blend of Cloud Infrastructure Administration, Site Reliability Engineering, Security Operations, and Vulnerability Management across multiple clients
  • Coordinate with client product teams, engineering team members, and other stakeholders to monitor and maintain a secure and resilient cloud-hosted infrastructure to established SLAs in both production and non-production environments
  • Innovate and implement solutions using automated orchestration and configuration management techniques
  • Understand the design, deployment, and management of secure and compliant enterprise servers, network infrastructure, boundary protection, and cloud architectures using Infrastructure-as-Code
  • Create, maintain, and peer review automated orchestration and configuration management codebases, as well as Infrastructure-as-Code codebases
  • Maintain IaC tooling and versioning within Client environments
  • Implement and upgrade client environments with CI/CD infrastructure code and provide internal feedback to development teams for environment requirements and necessary alterations
  • Work across AWS, Azure and GCP, understanding and utilizing their unique native services in client environments
  • Configure, tune, and troubleshoot cloud-based tools, and manage cost, security, and compliance for the Client’s environments
  • Monitor and resolve site stability and performance issues related to functionality and availability
  • Work closely with client DevOps and product teams to provide 24x7x365 support to environments through Client ticketing systems
  • Support definition, testing, and validation of incident response and disaster recovery documentation and exercises
  • Participate in on-call rotations as needed to support Client-critical events and operational needs that may fall outside business hours
  • Support testing and data reviews to collect and report on the effectiveness of current security and operational measures, and remediate deviations from those measures
  • Maintain detailed diagrams representative of the Client’s cloud architecture
  • Maintain, optimize, and peer review standard operating procedures, operational runbooks, technical documents, and troubleshooting guidelines