Infrastructure Engineer

Germany, Full-Time, Middle
Salary not disclosed

Job Details

Experience
5+ years
Required Skills
Kubernetes, Linux, Terraform, Networking, Ansible

Requirements

  • 5+ years of experience in bare-metal, HPC/GPU, data-center, or systems infrastructure engineering.
  • Hands-on ownership of physical and compute infrastructure.
  • Strong proficiency in bare-metal Linux (RHEL, Rocky, Ubuntu) including firmware, BMC, PXE, and kernel/storage tuning.
  • Solid understanding of networking and storage fundamentals.
  • Demonstrable experience with the NVIDIA GPU stack (drivers, CUDA, GPU Operator, MIG, DCGM) in production.
  • Experience serving models on GPUs in production environments.
  • Comfortable operating in air-gapped or on-prem environments.
  • Ability and willingness to travel to customer sites for builds and deployments.
  • Methodical approach to hardware incidents and documentation.

Responsibilities

  • Design, size, provision, and operate bare-metal GPU server fleets across on-prem and air-gapped environments.
  • Manage the NVIDIA GPU stack end to end, including drivers, CUDA, GPU Operator, MIG, and DCGM.
  • Build the bare-metal substrate for Kubernetes, including node lifecycle, container runtimes, and kernel/NUMA tuning.
  • Engineer data-center networking and resilient storage systems such as Ceph, ZFS, and NVMe.
  • Collaborate with ML and MLOps teams on on-prem inference serving using tools like Triton, KServe, and vLLM.
  • Plan and execute on-site hardware build-outs, including rack integration, cooling/power sizing, and operator handovers.