Senior DevOps Engineer / Site Reliability Engineer
Stellar Cyber | Cybersecurity
United States | Full-Time | Senior
Salary: 165,000 - 215,000 USD per year
Job Details
- Experience: 5+ years
- Required Skills: Docker, Python, Kubernetes, Go, Grafana, Prometheus, CI/CD, Linux, Terraform, Helm
Requirements
- 5+ years of experience in DevOps, SRE, or Platform Engineering roles.
- Strong expertise with Kubernetes, Docker, and container orchestration.
- Hands-on experience managing production cloud environments.
- Strong Infrastructure as Code experience with Terraform and Helm.
- Experience with CI/CD tools and deployment automation.
- Advanced troubleshooting skills in Linux systems, networking, and distributed systems.
- Experience with observability platforms including Prometheus, Grafana, Loki, Alertmanager, and Elastic Stack.
- Strong programming and scripting skills in Python, Bash, or Go.
- Experience supporting high-availability production systems and on-call operations.
- Must reside on the East Coast.
Responsibilities
- Administer and maintain Kubernetes clusters and containerized workloads.
- Manage cloud infrastructure across OCI, AWS, GCP, or Azure environments.
- Develop and maintain CI/CD pipelines for reliable application deployments.
- Implement and manage Infrastructure as Code (IaC) using Terraform and Helm.
- Build automation tooling and operational workflows using Python, Go, or Bash.
- Drive observability initiatives including monitoring, logging, tracing, and alerting improvements.
- Monitor, troubleshoot, and resolve production incidents while participating in on-call rotations.
- Support and optimize distributed data platforms including Kafka, Elasticsearch, Spark, Redis, and MongoDB.
- Improve platform reliability, scalability, and operational efficiency using SRE best practices.