Infrastructure Operations Engineer

Nscale
GPU Cloud AI
Join our thriving remote-first team. Geography is no barrier to impact or connection.
Full-Time
Middle
Salary not disclosed

Job Details

Required Skills
Python, Bash, Git, Kubernetes, Linux, Terraform, Networking, Ansible

Requirements

  • Platform and data centre (DC) fundamentals: servers, networks, storage, and virtualisation.
  • Linux fundamentals: CLI, systemd, filesystems, permissions, and networking tools.
  • Networking basics: IP addressing, subnets, VLANs, routing, DNS, and firewalls.
  • Kubernetes exposure: nodes, pods, services, and logs.
  • GPU awareness: basic diagnostics such as nvidia-smi.
  • Observability foundations: dashboards and alerts.
  • Scripting and automation: Bash or Python snippets.
  • Git for version control.
  • Cloud and virtualisation basics.

Responsibilities

  • Join the Support duty rotation and handle day‑to‑day tickets and alerts.
  • Manage and resolve tickets whilst keeping all parties informed.
  • Follow established runbooks to resolve common issues.
  • Learn platform fundamentals to support customers.
  • Participate in monitoring, troubleshooting, and triage.
  • Deliver assigned tasks and project work.
  • Share knowledge via documentation and training materials.
  • Take part in incident reviews.
  • Identify areas for automation.
  • Participate in on‑call or out‑of‑hours work.