Staff Software Engineer, Infrastructure
Wisdom, Dental Technology
Wisdom has employees across the US.
Full-Time, Staff
Salary not disclosed
Job Details
- Experience: 8+ years running production systems
- Required Skills: AWS, Node.js, PostgreSQL, Kubernetes, TypeScript, CI/CD, Terraform, Datadog
Requirements
- 8+ years of experience running production systems at staff/principal scope.
- Deep AWS or GCP experience in deploying, operating, and debugging distributed services.
- Strong proficiency with infrastructure as code (Terraform), containers, and orchestration (ECS/Kubernetes).
- Hands-on production experience operating major LLM APIs (OpenAI, Anthropic, or Google Vertex AI).
- Strong command of TypeScript or JavaScript, with Python or Go as a plus.
- Deep experience with relational database management, connection management, and query performance.
- Proven track record of defining reliability functions and building incident processes from scratch.
- Ability to reason from first principles and troubleshoot complex system failures.
- Excellent collaboration and communication skills to drive technical decisions across teams.
Responsibilities
- Set the reliability strategy for the platform, including SLOs, error budgets, and operating standards.
- Own observability end-to-end, including tracing, metrics, logging, and alerting.
- Define operations for AI-powered agentic workflows, focusing on retries, backpressure, and capacity controls.
- Harden the integration surface with dental insurance carriers and practice management systems.
- Own deploy and release engineering using Terraform and CI/CD best practices.
- Build the incident practice, including on-call rotations and blameless post-mortems.
- Set technical standards via code reviews and architecture guidance to level up the engineering team.