Site Reliability Engineer, Infra (Americas)

Posted about 2 months ago
Americas, Full-Time, Email Platform
Company: Resend
Location: Americas, EST, PST
Languages: English
Seniority level: Senior, 4+ years
Experience: 4+ years
Skills:
AWS, Node.js, Grafana, Postgres, CI/CD, Linux, DevOps, Terraform, Microservices
Requirements:
4+ years of experience in Site Reliability, Platform, or Infrastructure Engineering
Fluent in writing and speaking English
Strong experience with observability and monitoring tools (Datadog, Grafana, OpenTelemetry)
Understand distributed systems: queues, workers, caching, databases, networking
Write automation and tooling in Node.js
Comfortable designing systems with safety and fail-safe operations in mind
Comfortable working across the stack
Care deeply about incident management, postmortems, and continuous improvement
Responsibilities:
Evolve and shape on-call processes
Build automation for recovery, scaling, and self-healing systems
Improve observability across the stack
Define and track SLOs for core systems
Collaborate with engineering teams to design for reliability
Codify playbooks, postmortems, and reliability standards
Work with infrastructure spanning AWS, queues, databases, and workers