SRE - Cosmos

Posted about 1 month ago

📍 Location: Spain

🔍 Industry: Software Development

🏢 Company: P2P.org

🗣️ Languages: English

🪄 Skills: Python, Blockchain, GCP, Kubernetes, Grafana, Prometheus, CI/CD, RESTful APIs, Linux, DevOps, Terraform, Ansible, Scripting

Requirements:
  • Extensive hands-on experience administering Linux-based systems, including both hardware and software aspects.
  • Proven ability to implement IaC using tools like Terraform, Ansible, and Git.
  • Demonstrated success managing workloads on cloud providers such as Google Cloud Platform (GCP) and Oracle Cloud.
  • Experience deploying and managing applications on Kubernetes with tools like ArgoCD, Argo Workflows, GitHub Actions, Helm, and HashiCorp Vault.
  • Proficiency in shell scripting and in at least one programming language (Python or Golang) to automate infrastructure tasks and reduce manual effort.
  • Skilled in configuring comprehensive observability solutions (Prometheus, Grafana, Loki, OTEL agent) to ensure prompt and accurate incident response.
  • Hands-on experience running and configuring blockchain nodes/validators, especially within the Cosmos / Tendermint ecosystems.
Responsibilities:
  • Provision, maintain, and scale multi-cloud/multi-architecture infrastructure using Infrastructure as Code (IaC) tools and CI/CD pipelines.
  • Develop and manage Kubernetes workloads following GitOps best practices.
  • Create and maintain Ansible roles to deploy and manage various blockchain validators.
  • Implement and refine monitoring, alerting, and logging solutions using Prometheus, Grafana, Loki, and Opsgenie.
  • Continuously improve reliability through proactive security patches, system hardening, and performance tuning.
  • Build and maintain CI/CD workflows, ensuring seamless deployments to Kubernetes clusters.
  • Contribute to open-source tooling that supports the Bitcoin and Tendermint ecosystems.
  • Engage in architecture discussions and technical presentations, influencing the direction of core infrastructure.
  • Collaborate closely with passionate engineers, DevOps specialists, and community contributors across Web3.
  • Participate in a 24/7 on-call rotation, ensuring rapid response to and resolution of critical infrastructure incidents.