Sr. Site Reliability Engineer

Posted 2024-10-20

💎 Seniority level: Senior

📍 Location: United States

🏢 Company: ARFA Solutions, LLC

🪄 Skills: Leadership, Software Development, Agile, Frontend Development, HTML, CSS, Java, JavaScript, Scrum, Communication Skills, Analytical Skills, Collaboration, CI/CD

Requirements:
  • Working embedded in a scrum team 50% of the time
  • Monitoring performance and proactively collaborating with other SREs
  • Conducting initiatives outside the scrum team's work
  • Ensuring correct logging, keeping synthetics up to date, and implementing fail-safes
  • Troubleshooting front-end JavaScript performance issues
  • Formulating alerts based on log analysis
Responsibilities:
  • Design and implement highly available and scalable infrastructure solutions
  • Develop and maintain automated deployment, configuration, and monitoring processes
  • Collaborate with cross-functional teams to ensure the reliability, security, and performance of systems
  • Identify and resolve performance and availability issues through proactive monitoring and alerting
  • Participate in incident response and troubleshooting efforts
  • Implement and improve disaster recovery and business continuity strategies
  • Maintain documentation and keep up to date with industry best practices
  • Stay current with emerging technologies and trends in the field of SRE
  • Lead and mentor junior members of the SRE team

Related Jobs

📍 United States

🧭 Full-Time

💸 147,100 - 207,600 USD per year

🔍 Cloud Infrastructure and Software Engineering

🏢 Company: HashiCorp

Requirements:
  • Professional experience designing or operating disaster recovery processes in a distributed cloud environment.
  • Professional experience with incident management in cloud environments.
  • Enjoy working on various scopes spanning software engineering, cloud infrastructure, and SRE.
  • Experience contributing to efficiency improvements of software at scale.
  • Experience collaborating cross-functionally to deliver engineering culture change.
  • Worked on infrastructure teams in customer-centric and agile organizations with empathy and compassion.
  • Worked with SaaS or other managed software offerings.
  • Experience in one or more of the major public clouds.

Responsibilities:
  • Use software engineering experience to solve problems and build automation for incident lifecycle management.
  • Coordinate disaster recovery processes and identify strategic process improvements.
  • Drive incident management capabilities and culture.
  • Participate in incident command on-call rotation.
  • Support incident management tooling.
  • Build technical skills and relationships within a team of engineers and SREs.
  • Learn, teach, and collaborate cross-functionally.

🪄 Skills: Agile, Product Development, Strategy, Communication Skills, Collaboration

Posted 2024-11-12