Site Reliability Engineer

Posted 2024-11-02

💎 Seniority level: Significant and demonstrated experience as a Site Reliability Engineer

📍 Location: Argentina, Bolivia, Brazil, Chile, Colombia, Ecuador, Guyana, Mexico, Nicaragua, Panama, Paraguay, Peru, Suriname, Uruguay, Venezuela

💸 Salary: 41500 - 70000 USD per year

🔍 Industry: Employment solutions for remote organizations

🏢 Company: Remote - Referral Board

🗣️ Languages: English

⏳ Experience: Significant and demonstrated experience as a Site Reliability Engineer

🪄 Skills: AWS, Docker, Node.js, Python, Java, Kubernetes, Go, CI/CD, Terraform

Requirements:
  • Significant and demonstrated experience as a Site Reliability Engineer, including architecting, implementing, and maintaining a platform.
  • Solid knowledge and experience in Kubernetes, AWS (or similar Cloud Provider), and Terraform.
  • Knowledge of CI/CD tools, preferably GitLab CI.
  • Experience with at least one back-end programming language (Elixir, Clojure, Java, Node.js, Python, etc.).
  • Experience with one programming language for developing SRE tooling (Go, Python).
  • Excellent communication and interpersonal skills.
  • Ability to work independently and self-guided.
  • Curiosity and willingness to learn and develop.
Responsibilities:
  • Managing and improving existing infrastructure.
  • Helping build the next generation of the platform using tools like Kubernetes, Terraform, and Docker.
  • Streamlining and automating deployment processes.
  • Collaborating closely with the Security team to address potential threats.
  • Supporting engineers and product teams to enhance scalability, stability, and reliability.

Related Jobs

📍 LATAM

🧭 Full-Time

💸 41000 - 70000 USD per year

🔍 Remote employment services

🏢 Company: Remote | 👥 1001-5000 | 💰 $300.0m Series C on 2022-04-05 | 🫂 on 2022-07-08 | Human Resources Services

  • Knowledge and experience in Kubernetes, AWS (or similar Cloud Provider) and Terraform.
  • Knowledge of CI/CD tools (GitLab, GitHub, Jenkins or similar).
  • Experience with at least 1 back-end programming language (Elixir, Clojure, Java, Node.js, Python, etc.).
  • Experience with Bash or Python Scripting.
  • Excellent communication and interpersonal skills.
  • Ability to work independently and in a self-guided manner.
  • Curiosity and willingness to learn and develop.
  • Holistic debugging skills.
  • Security knowledge and capabilities from a defensive and offensive standpoint.

  • Managing and improving our existing infrastructure.
  • Helping us build the next generation of our platform: using tools like Kubernetes, Terraform and Docker.
  • Streamlining and automating our deployment processes.
  • Work closely with our Security team to keep on top of potential threats/patches.
  • Support our engineers and product teams to improve overall scalability, stability and reliability.

AWS, Docker, Node.js, Python, Bash, Java, Kubernetes, Postgres, CI/CD, Terraform

Posted 2024-12-03

📍 Australia, Austria, Bangladesh, Belgium, Brazil, Canada, Colombia, Costa Rica, Croatia, Czech Republic, Denmark, Egypt, Estonia, Finland, France, Germany, Ghana, Greece, India, Indonesia, Ireland, Israel, Italy, Kenya, Mexico, Netherlands, Nigeria, Peru, Poland, Singapore, South Africa, Spain, Sweden, Switzerland, Uganda, United Arab Emirates, United Kingdom, United States of America, Uruguay

💸 109047 - 169455 USD per year

🔍 Nonprofit Organization, Technology

🏢 Company: Wikimedia Foundation

  • At least two years of experience in an SRE/Operations/DevOps role as part of a team.
  • Experience supporting high availability distributed production systems.
  • Experience with database administration and support.
  • Knowledge of configuration management and orchestration tools (e.g., Puppet, Ansible).
  • Familiarity with observability infrastructure (monitoring, metrics, logging).
  • Proficient in shell and scripting languages (e.g., Python, Go, Bash, Ruby).
  • Understanding of Linux/Unix fundamentals and debugging skills.
  • Excellent written and verbal communication skills.
  • BS or MS degree in Computer Science or equivalent work experience.

  • Deployment, configuration, and maintenance of distributed data systems for the data and analytics platform.
  • Implement data quality monitoring to alert the team of possible data issues.
  • Collaborate with Fundraising to integrate data from various self-hosted and third-party sources.
  • Provide engineering support during high-traffic campaigns.
  • Document internal systems and processes.
  • Ensure compliance with relevant regulations, such as Donor Privacy Policy, GDPR, and PCI DSS.
  • Manage users and permissions for data access control.
  • Advise on best practices for data input and streamline processes.

Python, Bash, Ruby, Data engineering, Go, Communication Skills, Collaboration, Linux, DevOps, Documentation, Compliance

Posted 2024-12-03

📍 LATAM

🔍 AI developer tools

NOT STATED

  • Reporting to the Enterprise Engineering Manager.
  • Setting up and maintaining infrastructure standards.
  • Playing a pivotal role in tool development both externally and internally.
  • Helping deploy software to enterprise customers.
  • Establishing strong partnerships with enterprise customers.
  • Managing variances in infrastructure types and implementing suitable solutions.

Leadership, Cloud Computing, Git, Kubernetes, Cross-functional Team Leadership, Communication Skills, Analytical Skills

Posted 2024-11-10

📍 Brazil

🔍 Real Estate Technology

🏢 Company: Grupo QuintoAndar

  • Experience with Cloud environment costs.
  • Knowledge to query, analyze, and summarize data.
  • Knowledge of SQL databases.
  • Knowledge of Kubernetes architecture (Pods, Containers, Namespaces, etc.).
  • Experience in infrastructure as code (Terraform, Crossplane, and/or Pulumi).
  • Experience with AWS Athena & AWS Cost Explorer is a plus.
  • Knowledge of advanced Kubernetes architecture (resource allocation, scheduler, auto scaler, etc).
  • Knowledge of Presto or Trino.
  • Knowledge of programming (Python and Golang are preferred).
  • Experience with observability stack (Prometheus, Grafana, OpenTelemetry).

  • Design, analyze, and maintain reports and dashboards about cloud usage.
  • Organize and present resource consumption and cost metrics to the engineering teams.
  • Identify and correct performance problems alongside engineers.
  • Collaborate with SRE, Data, and Engineering teams to understand cloud usage and identify improvements.

SQL, Flash, Kubernetes, Grafana, Prometheus, Collaboration

Posted 2024-11-07

📍 APAC, EMEA, AMER

🔍 DevSecOps Software

🏢 Company: GitLab | 👥 1001-5000 | 💰 $268.0m Series E on 2019-09-17 | 🫂 on 2023-02-09 | Developer Tools, DevOps, Open Source, SaaS, Cloud Security

  • Advanced datastore platform management experience, preferably using Postgres at scale.
  • Advanced Cloud Infrastructure management, preferably using GCP.
  • Advanced experience with Linux.
  • Solid experience with automation including developing infrastructure and database automations.
  • Experience with Terraform for automation.
  • Experience with orchestration tools like Chef and/or Ansible.
  • Solid experience implementing monitoring at scale, preferably using Prometheus and Grafana.
  • Willingness and ability to promote GitLab's CREDIT Values.
  • Superior verbal and written communication skills.
  • Ability to work asynchronously across timezones and cultures.

  • Build, run, and own the entire lifecycle of the PostgreSQL database engine for GitLab.com.
  • Automate operational tasks including package updates and configuration changes.
  • Develop warning systems for maintenance tasks like library upgrades.
  • Create monitoring and alerting systems to predict capacity needs.
  • Respond to user emergencies and support requests.
  • Implement and enhance security measures for GitLab infrastructure.
  • Partner with compliance assessors for regulatory certifications.
  • Collaborate with engineering teams to resolve architectural bottlenecks.

PostgreSQL, Software Development, GCP, Grafana, Postgres, Prometheus, Communication Skills, Collaboration, Terraform

Posted 2024-10-16

📍 Brazil, Portugal

🔍 Wellness

  • Bachelor's degree in computer science or equivalent professional experience;
  • Technical experience with AWS cloud services and software engineering;
  • Good knowledge of Kubernetes and its ecosystem;
  • Experience with operator-managed Infrastructure as Code, preferably Crossplane;
  • Ability to write software for production environments;
  • Excellent analytical and problem-solving skills, proven experience in identifying solutions for complex problems;
  • Collaboration and learning driven mindset;
  • CNCF Kubernetes Certifications (e.g. CKA, CKS or CKAD);
  • AWS Certifications;
  • Well-developed communication skills, with the ability to articulate ideas clearly when communicating to groups;
  • Ability to communicate in English.

  • Help to build a global, secure, scalable and cost-effective Cloud platform using Kubernetes in AWS;
  • Develop and evolve Kubernetes operators and other cloud-native automations in Kubernetes;
  • Build products and tools enabling engineering teams to create and maintain their cloud resources autonomously;
  • Help to ensure security and compliance by delivering secure products and implementing DevSecOps integrations;
  • Improve observability, reliability and cost awareness;
  • Support engineering teams in using the products and tools;
  • Build and maintain a modern set of CI/CD tools and services;
  • Keep all the Kubernetes clusters highly available and reliable;
  • Contribute to our product documentation (e.g. user guides, configurations, operations and troubleshooting procedures);
  • Participate in the definition of standards, RFCs (Requests for Comments), guidelines and best practices;
  • Live the mission: inspire and empower others by genuinely caring for your own wellbeing and that of your colleagues.

AWS, Kubernetes, Communication Skills, Collaboration, CI/CD, Problem Solving

Posted 2024-09-26

📍 Mexico, Colombia, Vietnam

🧭 Full-Time

🔍 Digital services

  • Excellent communication skills with internal and external stakeholders.
  • Ability to quickly learn new tools and adapt to new technologies.
  • Experience in Development.
  • Experience with engineering and architecture in AWS, GCP or Azure.
  • Cyber Security awareness.
  • Ability to grow and mentor junior engineers.
  • Experience automating infrastructure and configuration using tools in complex projects.

  • Establishes and implements the requirements for observability (monitoring, logging, and tracing), and other technologies used to validate the health of the systems and applications.
  • Works with the engineering team to deep dive into reliability issues at different levels such as software, application, and network.
  • Creates the strategy in collaboration with the technical leaders for the creation, automation, and deployment of infrastructure in the cloud.
  • Makes decisions that impact the architecture, workflow, and technologies that are fundamental for the functionality of the systems and applications.
  • Encourages the use of best practices among the team and the company.
  • Writes or implements tools to improve the delivery of applications.

AWS, GCP, Strategy, Azure, Communication Skills, Collaboration

Posted 2024-09-15

📍 Australia, Austria, Bangladesh, Belgium, Brazil, Canada, Colombia, Costa Rica, Croatia, Czech Republic, Denmark, Egypt, Estonia, Finland, France, Germany, Ghana, Greece, India, Indonesia, Ireland, Israel, Italy, Kenya, Mexico, Netherlands, Nigeria, Peru, Poland, Singapore, South Africa, Spain, Sweden, Switzerland, Uganda, United Arab Emirates, United Kingdom, United States of America, Uruguay

🧭 Full-Time

💸 109047 - 169455 USD per year

🔍 Nonprofit / Technology

  • At least two years of experience in an SRE/Operations/DevOps role as part of a team.
  • Experience supporting high availability distributed production systems.
  • Experience with database administration and support.
  • Comfortable with configuration management and orchestration tools (e.g., Puppet, Ansible, Chef, SaltStack).
  • Knowledge of modern observability infrastructure (monitoring, metrics, and logging).
  • Proficient in shell and scripting languages such as Python, Go, Bash, Ruby.
  • Good understanding of Linux/Unix fundamentals and debugging skills.
  • Excellent written and verbal communication skills.
  • BS or MS degree in Computer Science or equivalent work experience.

  • Deploy, configure, and maintain the distributed data systems that comprise our data and analytics platform.
  • Implement data quality monitoring that alerts the team of possible data issues.
  • Collaborate closely with the Fundraising team to integrate and use data from self-hosted and third-party sources.
  • Provide engineering support during high-traffic or critical campaigns.
  • Write and update internal documentation of systems and processes.
  • Ensure compliance with regulations like the Donor Privacy Policy, GDPR, and PCI DSS.
  • Create and manage users and permissions for data access control.
  • Advise on data input best practices and develop processes for data entry consistency.
  • Work closely with Fundraising Analytics to gather and prioritize data enhancement requests.

Python, Bash, Ruby, C (Programming language), Data engineering, Go, Communication Skills, Collaboration

Posted 2024-08-22

📍 Americas

🧭 Full-Time

🔍 Open source technology

🏢 Company: Canonical | 👥 1001-5000 | 💰 $12.8m Crowdfunding on 2013-08-22 | Internet of Things, Open Source, Cloud Computing, Linux, Software

  • Software Engineering or Computer Science degree.
  • Linux experience and familiarity with Linux networking and storage.
  • Python software development experience.
  • Demonstrated drive for continual learning.
  • DevOps experience.

  • Bring Python software-engineering skills to the operations domain.
  • Architect and run OpenStack, Kubernetes, and software-defined storage.
  • Enable DevSecOps for applications running on the managed infrastructure.
  • Work in a high-pressure operations environment with mission-critical services.

Python, Software Development, Kubernetes, Serverless

Posted 2024-08-07