Senior Site Reliability Engineer

Posted 3 months ago

💎 Seniority level: Senior

📍 Location: Spain

🔍 Industry: Mobility services

🏢 Company: Cabify | 👥 1001-5000 | 💰 $16,473,668 Debt Financing (about 1 year ago) | Internet, Logistics, Ride Sharing, Transportation, Mobile

🗣️ Languages: English

🪄 Skills: AWS, AWS EKS, Kubernetes, Microservices, Networking

Requirements:
  • Strong knowledge of Unix, networking stack, OSI model, containers, and monitoring.
  • Programming skills in at least one language; capability to learn others.
  • Natural tendency to automate tasks.
  • Effective and asynchronous communication skills.
  • Care for the company, team, and self.
  • Embrace diversity and humility.
  • Action-oriented and iterative problem solving.
  • Preference for simplicity over complexity.
  • Ability to identify and address bottlenecks.
  • Proficiency in English communication.
Responsibilities:
  • Evolving our infrastructure platform by building self-service components.
  • Working closely with Product and Infrastructure teams to develop infrastructure components.
  • Designing and implementing tooling for service availability, scalability, observability, and latency improvements.
  • Increasing reliability awareness with teams and reviewing implementations.
  • Defining SLIs, SLOs and SLAs as part of services' lifecycle.
  • Sharing an on-call schedule for owned platform services.
  • Solving problems in a highly available platform and building automations to prevent incidents.
  • Participating in the recruiting process to grow the engineering team.

Related Jobs

📍 United States, European timezones

🧭 Full-Time

🔍 Software Development

🏢 Company: Invert | 👥 11-50 | 💰 $20,149,993 Seed (8 months ago) | Data Management, SaaS, Application Performance Management

Requirements: Not stated
Responsibilities:
  • Design, build, and maintain scalable and secure cloud infrastructure as code
  • Develop and enforce Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to ensure software reliability
  • Enable cost transparency and optimize infrastructure spending
  • Reduce cognitive load for product engineers by creating streamlined, efficient development workflows
  • Build and maintain robust CI/CD pipelines that accelerate time from code to customer
  • Create and maintain intuitive, comprehensive observability solutions for end-to-end system monitoring
  • Lead and continuously improve our Incident Management process
  • Participate in the on-call rotation, serving as a First Responder to quickly address and resolve system issues
  • Develop and maintain incident response playbooks and post-mortem practices

🪄 Skills: AWS, Docker, CI/CD, Linux, Terraform

Posted 20 days ago

📍 Europe

🧭 Full-Time

🔍 Software Development

🏢 Company: Sanity | 👥 51-200 | 💰 Corporate (over 2 years ago) | Software Development

Requirements:
  • Proven experience with SRE/DevOps tools, processes, and culture.
  • Proficient in programming languages like Python, Go, and TypeScript.
  • 5+ years of experience participating in an SRE on-call rotation.
  • Analytical mindset for designing, diagnosing, and optimizing infrastructure.
  • Skilled in managing scalable, highly available, cloud-based applications.
  • Hands-on experience with Kubernetes for orchestrating, scaling, and managing containerized applications in the cloud.
  • Strong database management skills, particularly with PostgreSQL.
  • Experience with infrastructure as code, using tools like Terraform.
  • Proficient in building and maintaining CI/CD pipelines.
  • Familiarity with observability tools like Prometheus and similar stacks.
  • Calm and clear-headed in incident and outage situations, with a thoughtful communication style for high-pressure environments.
  • Open-minded yet discerning when it comes to exploring new technologies.
Responsibilities:
  • Plan and implement a global platform for delivering our software as a service.
  • Diagnose and troubleshoot complex distributed systems.
  • Ensure observability and analyze the behavior of our stack.
  • Handle orchestration, deployment, monitoring, and automation.
  • Participate in our on-call rotation.

🪄 Skills: PostgreSQL, Python, Cloud Computing, ElasticSearch, Kubernetes, TypeScript, Go, Prometheus, CI/CD, Linux, DevOps, Terraform, Microservices

Posted 20 days ago

📍 Worldwide

🧭 Contract

🔍 Software Development

🏢 Company: Teravision Technologies | 👥 251-500 | 💰 over 13 years ago | Android, iOS, Mobile Apps, Information Technology, Software

Requirements:
  • Experience managing and maintaining Kubernetes (K8s) infrastructure, including updates, patching, and software configuration management.
  • Familiarity with CI/CD pipelines, particularly TeamCity, and integrating tools like SonarQube.
  • Hands-on experience with AWS services such as S3, Route 53, and others.
  • Strong understanding of backend systems and infrastructure management.
  • Proficiency in troubleshooting, debugging, and ensuring system reliability in production environments.
  • Prior experience in an on-call role.
  • Knowledge of monitoring and alerting tools to support on-call responsibilities.
Responsibilities: Not stated

🪄 Skills: AWS, Kubernetes, CI/CD, Troubleshooting, Debugging

Posted about 2 months ago