Senior Consultant Service Reliability Engineer

India, Full-Time, Senior

Salary not disclosed

Job Details

Required Skills: AWS, Docker, Python, Bash, GCP, Kubernetes, Azure, Go, DevOps

Requirements

Strong experience in Site Reliability Engineering, DevOps, infrastructure engineering, or related fields.
Hands-on expertise with programming or scripting languages such as Python, Go, or Bash.
Solid understanding of at least one major cloud platform, such as AWS, Azure, or GCP.
Experience with observability and monitoring tools such as Grafana, Datadog, ELK Stack, Dynatrace, New Relic, or similar platforms.
Familiarity with DevOps and GitOps methodologies and CI/CD practices.
Strong knowledge of containerization and orchestration technologies such as Docker, Kubernetes, AWS EKS, or similar platforms.
Understanding of microservices architecture, RESTful APIs, serverless systems, and modern cloud-native design patterns.
Ability to troubleshoot complex infrastructure and production issues using logs, metrics, and monitoring data.
Excellent communication, collaboration, and stakeholder management skills.
Strong ownership mindset with the ability to work independently in high-pressure environments.
Flexibility to participate in rotational and on-call support schedules.

Responsibilities

Improve system reliability and resilience by implementing fault-tolerant architectures and automation strategies.
Enhance monitoring, observability, and alerting systems to reduce operational overhead and improve incident detection and response times.
Manage production incidents, coordinate communication with stakeholders, and conduct root cause analysis investigations.
Collaborate with development teams to improve application reliability, scalability, and operational readiness.
Integrate observability and automation practices into CI/CD pipelines and DevOps workflows.
Monitor system performance and optimize infrastructure to meet SLAs and SLOs.
Implement and maintain cloud-native infrastructure solutions aligned with reliability and security best practices.
Drive continuous improvement initiatives, including chaos engineering and proactive reliability testing.
Build operational dashboards, metrics, and logging solutions to improve visibility across distributed systems.
Support 24x7 operational needs through rotational or on-call responsibilities when required.
