Staff Software Engineer (Alerting & Observability)

Posted 3 months ago
200000 - 245000 USD per year
United States, Full-Time, Software Development
Company: Cribl
Location: United States, EST, PST
Languages: English
Seniority level: Staff
Skills:
Backend Development, Node.js, SQL, Frontend Development, React.js, TypeScript, Algorithms, ClickHouse, Data Structures, Prometheus, CI/CD, DevOps, Mentoring, Software Engineering
Requirements:
- Strong proficiency in TypeScript/Node.js with a proven track record of building production-grade services
- Experience with query languages for metrics and monitoring (PromQL, SQL, or similar) and ability to write complex queries for data analysis
- Hands-on experience building or maintaining alerting systems, including rule evaluation engines and notification pipelines
- Experience with time-series databases and columnar storage systems (ClickHouse experience is a plus)
- Frontend development skills with React and modern JavaScript frameworks for building data visualization and management interfaces
- Strong understanding of distributed systems, data structures, and algorithms
- Experience with observability concepts including metrics, logs, traces, and their correlation
- Ability to work independently with minimal supervision and a track record of learning quickly
- Dedication to writing clean, maintainable, and well-tested code
- Prometheus ecosystem, including Alertmanager
- Background in building rule engines or expression evaluation systems
- Experience with notification systems and integrations (PagerDuty, Slack, webhooks, etc.)
- Familiarity with observability tools like Grafana, ELK stack, or similar solutions
- Experience with CI/CD pipelines such as Bitbucket, Jenkins, CircleCI, etc.
- Understanding of alert fatigue mitigation strategies and intelligent alerting patterns
- Experience with high-cardinality data and performance optimization
- Willingness to speak your mind and share ideas
- Appreciation for humor and a love for goats
- Comfort working remotely
Responsibilities:
- Design and build sophisticated alerting systems
- Develop query-based alert rules and expressions using PromQL, SQL, and other query languages
- Create intelligent alert routing, deduplication, and correlation mechanisms
- Build scalable backend services for alert evaluation, notification delivery, and alert management workflows
- Optimize time-series data storage and query performance
- Develop intuitive interfaces for alert configuration, visualization, and management using React
- Collaborate with cross-functional teams to understand monitoring requirements
- Mentor and guide engineers on best practices for observability and alerting architecture
- This position will require stand-by, on-call, or off-hours duties
About the Company
Cribl
251-500 employees, Real Time