Senior Sustaining & Forward Deployed Engineer

Abacus Insights, Health Tech
Remote US, Full-Time, Senior
Salary not disclosed

Job Details

Experience
10+ years
Required Skills
AWS, Python, CI/CD, Databricks, Distributed Systems

Requirements

  • 10+ years of experience in software engineering, SRE, or production operations.
  • Deep hands-on experience operating production systems in AWS.
  • Strong experience troubleshooting Databricks and large-scale data platforms.
  • Proficiency in Python.
  • Understanding of distributed systems.
  • Strong experience in incident management and root cause analysis (RCA) practices.
  • Knowledge of monitoring, alerting, and observability.
  • Experience with CI/CD pipelines that leverage Infrastructure as Code.
  • Ability to own problems end-to-end, from detection to resolution.
  • Excellent communication skills for incidents and customer escalations.

Responsibilities

  • Lead real-time incident triage, mitigation, and recovery efforts.
  • Drive root cause analysis (RCA) with a focus on systemic fixes.
  • Own post-launch reliability and operational quality of core systems.
  • Investigate and resolve complex field issues and production defects.
  • Engage directly with strategic customers for deployments and high-impact issues.
  • Write production-quality code to automate workflows and improve observability.
  • Mentor engineers through pairing, reviews, and incident leadership.