Senior Site Reliability Engineer

Posted 2024-10-26

💎 Seniority level: Senior, 5+ years

💸 Salary: 120000 - 140000 USD per year

🔍 Industry: Healthcare

🏢 Company: Hone Health

🗣️ Languages: English

⏳ Experience: 5+ years

🪄 Skills: Microservices

Requirements:

Proven SRE experience in a highly complex, mission-critical environment (5+ years).
Excellent problem-solving skills, with the ability to troubleshoot complex technical issues.
Extensive experience with Azure cloud technologies, including designing and optimizing complex architecture.
Deep understanding of cloud architecture, microservices, and containerization.
Strong understanding of cloud security principles, including security audits and compliance.
Proficiency in infrastructure as code and CI/CD pipelines.
Experience with disaster recovery planning and business continuity.
Strong interpersonal skills for effective collaboration across teams.
Azure certifications are a plus.
Background in managing EHR/EMR systems in healthcare is a strong plus.

Responsibilities:

Oversee the management, optimization, and scaling of EHR/EMR cloud infrastructure.
Serve as the primary custodian of the EHR/EMR platform ensuring high availability and data integrity.
Manage IT infrastructure including servers, networking, and endpoints.
Implement security practices and compliance measures to protect healthcare data.
Develop disaster recovery plans and maintain business continuity.
Collaborate with cross-functional teams to align technology with company goals.
Maintain documentation of technology operations processes and system architecture.
Implement monitoring tools to proactively detect issues and optimize performance.

Related Jobs

🔥 Senior Site Reliability Engineer (Czech Republic remote)

Posted 2024-11-22

📍 Czech Republic

🔍 Software Infrastructure

Solve complex system problems using automation.
Address storage issues with innovative software-defined solutions.
Manage network challenges effectively.

🔥 Senior Site Reliability Engineer (SRE)

Posted 2024-11-22

🔍 Commission management

Operate across the engineering organization to support development teams with the needed tools and processes.
Ensure great service quality for paying customers and keep the business informed when issues arise.
Provide infrastructure, platform, reliability, and observability support to internal customers.
Invest in iterative efforts to refine work and deliver real-world results while improving processes.

🔥 Senior Site Reliability Engineer

Posted 2024-11-21

📍 United States

🧭 Full-Time

🔍 Legal technology

🏢 Company: Ramp Talent

🔧 Requirements

Curiosity, willingness to learn, and passion for continuous improvement.
Proficiency in all skills expected of SRE IIs.
Bachelor's degree in computer science, information systems, or a related field; comparable certifications; or equivalent direct work experience.
A minimum of 8 years of experience in hands-on technical roles.
A minimum of 2 years of Site Reliability Engineering experience.
Experience building autonomous systems that manage software operational details without human intervention.

💡 Responsibilities

Developing autonomous systems that manage the details necessary to build, deploy, test, and operate all Filevine Inc. products.
Being the voice of Reliability on your team throughout the SDLC.
Collecting, monitoring, aggregating, dashboarding, and alerting on software and server events.
Improving the CI/CD pipeline.
Developing playbooks, tools, and scripts to streamline processes and shorten problem resolution time.
Identifying and fixing gaps in the availability of systems.
Improving and defending the security of software and systems.
Documenting and diagramming processes, procedures, and best practices.
Finding, learning, improving, or creating new tools that are reliable, usable, and helpful.
Mentoring, training, and reviewing more junior engineers.
Participating in an on-call rotation for 24/7 production reliability support.

🪄 Skills: Leadership, CI/CD, Mentoring

🔥 Senior Site Reliability Engineer (Poland remote)

Posted 2024-11-21

📍 Poland

🔍 Software

🔥 Senior Site Reliability Engineer (SRE) - Disaster Recovery Specialist (m/f/x)

Posted 2024-11-21

🧭 Full-Time

🔍 Software / SaaS

🔧 Requirements

Degree in Computer Science, Information Technology, or a related field.
5+ years of hands-on experience in site reliability engineering, ideally with a focus on disaster recovery.
Experience in a cloud-based SaaS environment.
Strong expertise in designing and implementing disaster recovery solutions using industry-leading technologies and methodologies.
Proficiency in cloud platforms such as AWS, Azure, or Google Cloud Platform.
Experience with infrastructure as code (IaC) tools such as Terraform or CloudFormation.
Excellent communication skills with the ability to effectively collaborate with cross-functional teams and communicate technical concepts to non-technical stakeholders.

💡 Responsibilities

Design, implement, and maintain disaster recovery solutions for cloud-based SaaS environments.
Develop and document comprehensive disaster recovery plans, procedures, and runbooks.
Conduct drills and exercises to test and validate the effectiveness of these plans.
Collaborate with engineering, operations, and security teams to identify and mitigate potential risks to system availability and data integrity.
Monitor system performance and health metrics; proactively identify areas for improvement.
Implement preventive measures to enhance system reliability and resilience.
Participate in incident response and post-incident reviews; analyze root causes of failures.
Implement corrective actions to prevent recurrence.

🔥 Senior Site Reliability Engineer (SRE) - Disaster Recovery Specialist (m/f/x)

Posted 2024-11-20

🧭 Full-Time

🔍 Software Development

🔧 Requirements

Degree in Computer Science, Information Technology, or a related field.
5+ years of hands-on experience in site reliability engineering, ideally with a focus on disaster recovery.
Strong expertise in designing and implementing disaster recovery solutions using leading technologies.
Proficiency in cloud platforms such as AWS, Azure, or Google Cloud Platform.
Experience with infrastructure as code (IaC) tools like Terraform or CloudFormation.
Excellent communication skills for collaboration with cross-functional teams and non-technical stakeholders.

💡 Responsibilities

Design, implement, and maintain disaster recovery solutions for a cloud-based SaaS environment.
Develop and document comprehensive disaster recovery plans, procedures, and runbooks.
Conduct drills and exercises to validate the effectiveness of disaster recovery plans.
Collaborate with engineering, operations, and security teams to identify and mitigate risks.
Proactively monitor system performance and health metrics, and implement preventive measures.
Participate in incident response and post-incident reviews to analyze root causes and implement corrective actions.

🔥 Senior Site Reliability Engineer

Posted 2024-11-17

📍 Canada

🔍 Software Supply Chain Management

🏢 Company: FOSSA

🔧 Requirements

Strong, demonstrated experience as a technical lead designing, building, and maintaining scalable infrastructure and tooling.
Strong knowledge of at least one cloud platform and maintaining managed services (we use AWS).
Strong experience implementing Infrastructure as Code using Terraform, Helm, and Kubernetes.
Experience building and maintaining build pipelines, deploying new services, and familiarity with CI/CD tools such as Buildkite, CircleCI, and GitHub Actions.
Experience with logging and monitoring tools such as Datadog, StatsD, Prometheus, and Grafana.
Experience with packaging and deploying services using Docker on Linux.
Ability to break down complex problems, troubleshoot, drive towards a solution, and communicate it with the team and stakeholders.
Willingness to accept feedback and incorporate it into work.
Experience with source control tooling and processes, including branching, merging, and rebasing (we use git).
Willingness to take part in an on-call rotation.

💡 Responsibilities

Scale cloud infrastructure to meet increasing demand.
Assist development teams in deploying new services.
Ensure platform security and adherence to best practices.
Improve development tools, CI/CD pipelines, monitoring, and release processes.
Help teams use Helm and Kubernetes, and shape best practices.
Build access control and secret management solutions.
Maintain deployments for on-premise customers.

🪄 Skills: AWS, Docker, Git, Kubernetes, Grafana, Prometheus, CI/CD, Linux, Terraform

🔥 Senior Site Reliability Engineer

Posted 2024-11-16

📍 U.S.

🧭 Full-Time

💸 140000 - 160000 USD per year

🔍 Cybersecurity / Open source software

🔧 Requirements

Sense of curiosity, resourcefulness, and pragmatism.
Expertise with multi-region deployments in public cloud environments.
Demonstrable production Kubernetes experience (Managed Kubernetes, Helm, kubectl, kOps, etc.).
Strong background in Reliability Engineering, DevOps, Software Engineering.
Fluency with at least one programming language, such as C#, Python, or Go.
Experience with cloud deployment and automation tools/methodologies (e.g., GitOps, Terraform, Pulumi).
Proficiency using source control such as Git.
Ability to maintain discretion and handle sensitive information.
Staying current with trends and new technologies.
Collaborative and adaptable mindset.
Excellent communication skills.
Strong problem-solving skills.

💡 Responsibilities

Take ownership of the Bitwarden cloud infrastructure, focusing on quality.
Evaluate infrastructure regularly, making recommendations for reliability, security, availability, scalability, and cost management.
Implement site reliability tools and observability systems.
Respond to outages and participate in a 24x7 support strategy.
Contribute to architectural designs and engineering operations at scale.
Engage in code reviews and spread technical knowledge.
Contribute to incident management processes.
Collaborate with teams to refine priorities and deliverables.
Align SLIs, SLOs, and SLAs with product owners.
Identify opportunities for new initiatives.
Influence the SDLC as Bitwarden scales.
Mentor team members.

🪄 Skills: Python, Git, Kubernetes, C#, Strategy, Go, Communication Skills, DevOps, Terraform

🔥 Senior Site Reliability Engineer (SRE)

Posted 2024-11-12

🧭 Contract

🔧 Requirements

Minimum of 5-7 years of experience in Site Reliability Engineering or related fields.
Proven experience designing and implementing fault-tolerant, scalable systems.
Deep understanding of reliability methodologies like DFR, FMEA, and MTBF.
Proficiency with tools such as Datadog, PagerDuty, Marvin, and Backstage.
Strong coding skills in one or more programming languages relevant to SRE.
Exceptional analytical skills for complex issue investigation.
Willingness to learn new products and tools.
Excellent communication skills for a distributed team environment.

💡 Responsibilities

Identify and resolve complex bugs within the codebase.
Enhance system reliability, scalability, and performance through code maintenance.
Restart services and implement necessary code changes.
Investigate complex system issues and develop resolutions.
Design and build fault-tolerant, scalable systems for high availability.
Apply methodologies like DFR, FMEA, and MTBF.
Develop and maintain reliability standards and documentation.

🔥 Senior Site Reliability Engineer (SRE) - LATAM (Remote)

Posted 2024-11-10

📍 LATAM

🔍 AI development tools

🔧 Requirements

Leverage skills, knowledge, and adaptability to address complex infrastructure needs.
Provide high-quality solutions tailored to each enterprise customer's unique requirements.

💡 Responsibilities

Report to the Enterprise Engineering Manager.
Set up and maintain infrastructure standards.
Play a pivotal role in external and internal tool development.
Facilitate software deployment to enterprise customers.
Establish partnerships with enterprise customers to improve satisfaction.
Manage variances in infrastructure types and implement suitable solutions.

🪄 Skills: Leadership, Cloud Computing, Git, Kubernetes, Cross-functional Team Leadership, Communication Skills, Analytical Skills
