Sr./Staff - Infrastructure/Site Reliability Engineer (SRE)

Posted 5 months ago

United States | Full-Time | AI Risk Decisioning

Company: Oscilar

Location: United States

Languages: English

Seniority level: Senior

Experience: Proven track record as a senior SRE or Infrastructure Engineer in high-scale environments.

Skills:

AWS, Docker, Java, Kafka, Kubernetes, ClickHouse, Go, CI/CD, DevOps, Terraform, Microservices

Requirements:

Proven track record as a senior SRE or Infrastructure Engineer in high-scale environments
Expert-level skills in AWS and Infrastructure as Code (Pulumi, Terraform)
Strong programming ability in Go or Python
Deep understanding of distributed systems (Kafka, ClickHouse) and microservices architecture
Mastery of container orchestration (Kubernetes) and production debugging
Strong sense of ownership, and the judgment to balance velocity with reliability

Responsibilities:

Architect and operate resilient cloud infrastructure (AWS, Pulumi, Kubernetes)
Lead initiatives to improve availability, latency, and performance at scale
Design and evolve CI/CD pipelines
Define metrics, alerts, and runbooks for observability
Run chaos experiments and failure simulations
Mentor engineers and set SRE best practices