SRE - Cosmos
Posted about 1 month ago
📍 Location: Spain
🔍 Industry: Software Development
🏢 Company: P2P.org
🗣️ Languages: English
🪄 Skills: Python, Blockchain, GCP, Kubernetes, Grafana, Prometheus, CI/CD, RESTful APIs, Linux, DevOps, Terraform, Ansible, Scripting
Requirements:
- Extensive hands-on experience administering Linux-based systems, including both hardware and software aspects.
- Proven ability to implement Infrastructure as Code (IaC) using tools like Terraform, Ansible, and Git.
- Demonstrated success managing workloads on cloud providers such as Google Cloud Platform (GCP) and Oracle Cloud.
- Experience deploying and managing applications on Kubernetes with tools like ArgoCD, Argo Workflows, GitHub Actions, Helm, and HashiCorp Vault.
- Proficiency in shell scripting and in at least one programming language (Python or Golang) to automate infrastructure tasks and reduce manual effort.
- Skilled in configuring comprehensive observability solutions (Prometheus, Grafana, Loki, OTEL agent) to ensure prompt and accurate incident response.
- Hands-on experience running and configuring blockchain nodes/validators, especially within the Cosmos / Tendermint ecosystems.
Responsibilities:
- Provision, maintain, and scale multi-cloud/multi-architecture infrastructure using Infrastructure as Code (IaC) tools and CI/CD pipelines.
- Develop and manage Kubernetes workloads following GitOps best practices.
- Create and maintain Ansible roles to deploy and manage various blockchain validators.
- Implement and refine monitoring, alerting, and logging solutions using Prometheus, Grafana, Loki, and Opsgenie.
- Continuously improve reliability through proactive security patches, system hardening, and performance tuning.
- Build and maintain CI/CD workflows, ensuring seamless deployments to Kubernetes clusters.
- Contribute to open-source tooling that supports the Bitcoin and Tendermint ecosystems.
- Engage in architecture discussions and technical presentations, influencing the direction of core infrastructure.
- Collaborate closely with passionate engineers, DevOps specialists, and community contributors across Web3.
- Participate in a 24/7 on-call rotation, ensuring rapid response to and resolution of critical infrastructure incidents.