Senior Site Reliability Engineer

Moniepoint | Financial Technology

Remote, India | Full-Time | Senior

Salary not disclosed

Job Details

Minimum of 5 years of experience in SRE or Backend Engineering.
Strong ability to write clean, performant, and tested code in Java, Go, Rust, or Python.
Deep understanding of distributed systems architecture and design patterns.
Proficiency in microservices fundamentals and event-driven architectures.
Extensive experience with Google Cloud Platform (GCP) or a comparable cloud provider such as AWS or Azure.
Proficient in running production workloads on Kubernetes (GKE/EKS) and troubleshooting cluster/infrastructure issues.
Experience designing observability strategies using OpenTelemetry, Prometheus, New Relic, Datadog, or SigNoz.
Familiarity with operating and tuning production data stores like PostgreSQL or MySQL.
Experience working with streaming platforms such as Kafka or RabbitMQ in high-throughput environments.

Participate in on-call rotations as the primary technical lead and act as Incident Commander during major-severity incidents.
Instrument code to expose high-cardinality metrics and distributed traces.
Collaboratively define, measure, and defend Service Level Objectives (SLOs) and Error Budgets with product owners.
Write high-quality, production-ready code in Java, Go, or Python to build internal tooling, automation, and self-healing mechanisms.
Partner with Product Engineering teams during the design phase to implement reliability, scalability, and observability patterns.
Analyze system performance and traffic patterns to model capacity needs.
Conduct load testing and chaos engineering experiments to verify system resilience.
