Infrastructure Engineer (Compute)

Posted 2 days ago

💎 Seniority level: Senior, 5+ years

🔍 Industry: Software Development

🏢 Company: FluidStack
👥 11-50
💰 Private 8 months ago
Private Cloud, Cloud Computing, Machine Learning, Generative AI, Information Technology, Small and Medium Businesses, Cloud Storage, Software, GPU

⏳ Experience: 5+ years

Requirements:
  • 5+ years of experience in compute infrastructure engineering.
  • Strong knowledge of Linux systems administration and performance tuning.
  • Experience with bare-metal provisioning tools (MAAS, Metal3, Tinkerbell, or similar).
  • Familiarity with GPU hardware and workload optimization, especially kernel- and driver-level requirements.
  • Proficiency in automation tools (e.g., Ansible, Terraform).
  • Experience operating Kubernetes and SLURM clusters.

Responsibilities:
  • Design and implement GPU/ASIC infrastructure at the server, rack, and system level.
  • Troubleshoot complex GPU and compute-system failures.
  • Develop and maintain hardware/firmware management services.
  • Automate all aspects of the server lifecycle.
  • Own the end-to-end compute lifecycle, including partnering with vendors on RMAs.
  • Serve as the main point of contact for hardware escalation and troubleshooting.
  • Monitor system performance, identifying and resolving bottlenecks.
  • Automate deployment and management tasks to improve efficiency.
  • Collaborate with storage and network teams to ensure cohesive infrastructure operations.