Infrastructure Engineer

Posted 3 days ago

🏢 Company: Paradigm

Requirements:

Proven experience maintaining and scaling bare metal servers and cloud environments for production systems

Proficient at building tooling and scripts using Rust, Go or Python

Deep expertise deploying Kubernetes within production environments and working with IaC and configuration management tools like Terraform, Helm and ArgoCD

Skilled at deploying monitoring, alerting and observability systems (e.g., Prometheus, Grafana), securing and hardening those systems, and troubleshooting issues with engineers

Knowledgeable about Linux and networking, and troubleshooting on Linux systems

Familiarity with blockchain infrastructure, particularly the Ethereum ecosystem

Responsibilities:

Implement and manage the infrastructure that allows the engineering team to ship quickly and effectively

Proactively identify and eliminate bottlenecks in the DevOps process to ensure optimal developer velocity

Related Jobs

🔥 Lead Infrastructure Engineer – Data Systems

Posted 1 day ago

📍 United States

🧭 Full-Time

🔍 Advertising Software

🏢 Company: MNTN 👥 251-500 💰 $2,000,000 Seed over 2 years ago Advertising Real Time Marketing Software

🔧 Requirements

8+ years in infrastructure engineering or systems administration, with increasing scope and leadership.
Demonstrated experience tuning Linux kernel settings for disk and network performance.
Deep experience with virtualized environments (multiple hypervisors).
Proven ability to support large-scale SaaS infrastructure and large database clusters.
Strong scripting and automation skills in Python and Bash.
Familiarity with storage technologies, particularly iSCSI and network-based storage.
Understanding of core networking concepts, including layer 3 routing and TCP/IP fundamentals.
Experience with Ansible or similar configuration management tools.
Strong documentation skills and operational discipline.
Ability to travel on-site twice per year.

💡 Responsibilities

Architect and implement high-performance data warehousing infrastructure in collaboration with Data Engineering.
Tune Linux kernel parameters for optimal disk and network throughput—e.g., adjusting block sizes, optimizing IOPS, striping.
Design and support hybrid infrastructure solutions that combine colocated servers and cloud platforms.
Lead automation efforts using Ansible and scripting (Python, Bash) to configure, deploy, and maintain server clusters.
Own the performance and scalability of systems supporting large-scale database clusters (e.g., Postgres, MySQL, Oracle).
Define templates and standards for infrastructure deployment and management.
Drive ongoing performance improvements across the infrastructure stack.
Manage all aspects of data center operations including rack layout, IP planning, and hardware logistics.
Establish robust monitoring and alerting for all infrastructure components.

AWS, PostgreSQL, Python, SQL, Bash, Kubernetes, MySQL, Oracle, Data engineering, RDBMS, CI/CD, Linux, DevOps, Terraform, Networking, Ansible, Scripting

🔥 Data Infrastructure Engineer

Posted 1 day ago

📍 United States

🔍 Software Development

🏢 Company: Worth AI 👥 11-50 💰 $12,000,000 Seed over 1 year ago Artificial Intelligence (AI) Business Intelligence Risk Management FinTech

🔧 Requirements

Strong programming skills in Python and Node.js.
Proficient in SQL and experience with distributed query engines (e.g., Trino, Presto).
Experience with cloud-native data platforms such as AWS Glue.
Hands-on experience with infrastructure-as-code tools (Terraform, Pulumi, CloudFormation).
Familiarity with containerization and orchestration tools such as Kafka and Kubernetes.
Solid understanding of data governance, quality frameworks, and data lifecycle management.
Experience in streaming data architecture and tools like Apache Kafka, Kinesis, or Pub/Sub.
Background in supporting machine learning or analytics platforms.
Exposure to data mesh, data contracts, or modern data stack concepts.
Knowledge of DevOps principles applied to data systems.

💡 Responsibilities

Design, build, and maintain scalable and resilient data infrastructure in a cloud environment (AWS, Azure, or GCP).
Develop and maintain ETL/ELT pipelines using orchestration tools such as Airflow, Dagster, or dbt.
Optimize data workflows for reliability, performance, and cost efficiency across structured and unstructured datasets.
Manage data lake and data warehouse environments (e.g., Snowflake, BigQuery, Redshift, Delta Lake).
Ensure data security, privacy, and compliance, including role-based access control, data encryption, and audit logging.
Collaborate with data scientists, analysts, and product teams to ensure data accessibility, accuracy, and availability.
Support real-time and batch data processing frameworks, including Kafka, Spark, Flink, or similar tools.
Monitor, troubleshoot, and improve the observability and performance of data systems using tools like Prometheus, Grafana, or Datadog.
Maintain CI/CD pipelines for data infrastructure using Terraform, GitHub Actions, or similar tools.

🔥 LatAm - Cloud Infrastructure Engineer - E-Learning

Posted 2 days ago

🧭 Part-Time

🔍 E-Learning

🏢 Company: Truelogic 👥 101-250 Consulting Web Development Web Design Software

🔧 Requirements

3+ years of hands-on experience with AWS cloud infrastructure, particularly ECS Fargate, Lambda, DynamoDB, RDS, S3, CloudFront, and VPC configuration
Strong proficiency in Python web frameworks
Extensive experience with Docker, containerization, and container orchestration
Working knowledge of uWSGI, Nginx, and web server configuration
Familiarity with Linux system administration and shell scripting
Experience with infrastructure as code tools (CloudFormation preferred)
Understanding of networking concepts, security best practices, and performance optimization
Ability to manage multiple technical priorities and communicate clearly about complex systems
Self-motivated with a proactive approach to problem-solving
Comfort working in a part-time capacity while delivering high-impact results

💡 Responsibilities

Manage and optimize our AWS cloud infrastructure (ECS Fargate, Lambda, DynamoDB, S3, CloudFront, Aurora RDS, etc.)
Monitor and troubleshoot container deployments, ensuring high availability and performance
Implement and improve CI/CD pipelines for automated testing and deployment
Maintain security best practices and compliance across our infrastructure
Optimize costs while maintaining performance and reliability
Support and extend our Python Flask application architecture
Integrate and configure services including uWSGI, Nginx, and Redis
Manage DynamoDB and Aurora RDS tables and data workflows
Develop and maintain Lambda functions for various processing tasks
Work with Docker containers and containerization strategies
Diagnose and resolve technical issues across our development and production environments
Perform system upgrades and patches with minimal service disruption
Document technical processes, architecture decisions, and system configurations
Participate in on-call rotation for critical system support (as needed)
Recommend and implement architecture improvements based on evolving requirements
Research and evaluate new technologies and services that could benefit our infrastructure
Collaborate with the team to establish best practices for code quality and infrastructure management

🔥 Infrastructure Engineer (Compute)

Posted 2 days ago

🧭 Full-Time

🔍 Software Development

🏢 Company: FluidStack 👥 11-50 💰 Private 8 months ago Private Cloud Cloud Computing Machine Learning Generative AI Information Technology Small and Medium Businesses Cloud Storage Software GPU

🔧 Requirements

5+ years of experience in compute infrastructure engineering.
Strong knowledge of Linux systems administration and performance tuning.
Experience with bare metal provisioning tools (MaaS, Metal3, Tinkerbell, or other).
Familiarity with GPU hardware and workload optimization, especially kernel and driver level requirements.
Proficiency in automation tools (e.g., Ansible, Terraform).
Experience operating Kubernetes and SLURM clusters.

💡 Responsibilities

Design and implement GPU/ASIC infrastructure at the server, rack, and system level.
Troubleshoot complex GPU and compute system related failures.
Develop and maintain hardware/firmware management services.
Automate all aspects of the server lifecycle.
Own end-to-end compute lifecycle, including partnering with vendors on RMAs.
Serve as the main point of contact for hardware escalation and troubleshooting.
Monitor system performance, identifying and resolving bottlenecks.
Automate deployment and management tasks to improve efficiency.
Collaborate with storage and network teams to ensure cohesive infrastructure operations.

🔥 Senior Platform Infrastructure Engineer

Posted 3 days ago

🧭 Full-Time

💸 145,000 - 195,000 USD per year

🔍 Software Development

🏢 Company: Cavnue 👥 101-250 💰 $130,000,000 Series A about 3 years ago Information Services Autonomous Vehicles Software

🔧 Requirements

5+ years of hands-on experience in infrastructure engineering, DevOps, or SRE roles, with a track record of operating production cloud environments at scale.
Strong experience using Terraform for infrastructure provisioning and configuration management in cloud environments.
Proficiency in multi-cloud operations – Google Cloud Platform (GCP) is highly preferred; experience with Amazon Web Services (AWS) and/or Microsoft Azure is also acceptable.
Deep understanding of Kubernetes (required), including experience setting up and managing Kubernetes clusters, deploying containerized applications, and debugging cluster and networking issues.
Ability to write clean, maintainable code for automation and tooling in Python and/or Golang.
Familiarity with basic networking concepts and protocols (TCP/IP, DNS, load balancing, VLANs/VPCs, firewalls) and how they apply in cloud and hybrid environments.
Willingness to take part in on-call rotations and proven skills in troubleshooting and resolving infrastructure incidents under pressure.
Strong hands-on skills with Linux and command-line tools; you are comfortable using terminals and utilities (e.g. k9s for Kubernetes, tmux sessions, zsh or similar shells) to manage and debug systems efficiently.
Knowledge of zero trust architecture principles and a habit of incorporating security best practices into infrastructure design (formal security certifications are not required).
Excellent communication skills with the ability to work cross-functionally. You can collaborate in a fast-paced engineering organization, explain complex infrastructure concepts to team members, and contribute to a positive engineering culture.

💡 Responsibilities

Design and implement cloud and edge infrastructure
Use Terraform to provision and manage infrastructure resources consistently across multiple cloud providers (GCP preferred, with AWS/Azure as needed), enabling reproducible and auditable infrastructure changes.
Deploy, administer, and optimize Kubernetes clusters for containerized workloads. Handle cluster upgrades, scaling, monitoring, and troubleshoot complex issues in production Kubernetes environments.
Develop robust automation scripts and internal tools/services in Python and/or Golang to automate routine tasks, integrate systems, and improve operational efficiency across the infrastructure.
Implement monitoring, logging, and alerting solutions to track system performance and reliability. Proactively tune systems and address bottlenecks to maintain smooth operation of critical services.
Embed security best practices into the infrastructure, enforcing zero trust architecture principles (e.g. least privilege, identity-based access) to protect systems and data. Work closely with security teams to remediate vulnerabilities and ensure compliance with company policies.
Participate in an on-call rotation during the team’s initial growth phase, quickly responding to infrastructure incidents and leading efforts to restore service and perform root cause analysis.
Work closely with all teams to understand application needs and translate them into scalable infrastructure solutions. Communicate clearly across teams and document designs and processes for broad understanding.
Stay up to date with emerging technologies and industry best practices in cloud infrastructure, DevOps, and platform engineering. Lead or contribute to infrastructure projects that enhance deployment speed, cost efficiency, and overall platform reliability.

🔥 Infrastructure Engineer: Cloud

Posted 4 days ago

🧭 Full-Time

🔍 Software Development

🏢 Company: Clickatell

🔧 Requirements

Related IT qualification / 5+ years in a system administrative position
Red Hat Enterprise Linux certified (RHCE or better) or other appropriate Linux/Unix certification (preferred)
Cloud certifications (AWS preferred)
Proven experience as a SysOps Engineer or similar role.
Experience in virtualisation and cloud environments such as Amazon Web Services (AWS) or similar.
Perl, Python, Ruby and/or PHP scripting experience advantageous
Containerisation (Docker/Kubernetes) knowledge advantageous
Use of CI/CD tools (Ansible/Puppet/Terraform) advantageous
Proven experience in and with a large, ISP-type environment and infrastructure advantageous
Monitoring and alerting experience with open-source technologies like Icinga/Nagios, Nagvis, Logstash, Elasticsearch, Graphite and Kibana advantageous
Proven experience in production environments with the following is advantageous:
SAN storage solutions
Software package building and release management with software tools such as Puppet, Chef or Salt
Network/OS clustering
Must understand and demonstrate knowledge of:
Networking, from Ethernet to IP
Operating systems, from bare steel to network services
IP networks, including but not limited to working knowledge of DHCP, DNS, SMTP, FTP, HTTP
Minimum of 5 years in a system administrative position
Minimum of 3 years working experience with Unix or derivative
Minimum of 3 years working experience with troubleshooting hardware and/or software
Minimum of 3 years programming or scripting experience (advantageous)
Amazon Web Services and Virtualization technologies such as VMware

💡 Responsibilities

Be a thought leader with regard to Clickatell’s overall cloud adoption strategy.
Divisional policy and process formulation, strategic planning, resource coordination and operational execution of projects and assisting in procurement process.
Installation/configuration, operation, maintenance, and monitoring of the Clickatell messaging engine hardware, software, and related infrastructure with a focus on high availability, stability and security.
Work closely with software development teams to facilitate smooth integration of applications with cloud infrastructure.
Scripting and coding to automate routine tasks and improve operational efficiency.
Technical research and development to enable continuing innovation within the infrastructure.
Provide technical support and troubleshooting for cloud-based infrastructure issues.
Ensuring network, hardware, operating systems, software applications and any related procedures adhere to organizational values, enabling staff, customers, and partners.
Technical liaison with enterprise customers and vendors as required.
Service, maintain, commission, and support global platforms, with a view towards high availability.

🔥 Infrastructure Engineer (Storage, Hardware & Software)

Posted 4 days ago

🧭 Full-Time

🔍 Software Development

🏢 Company: FluidStack 👥 11-50 💰 Private 8 months ago Private Cloud Cloud Computing Machine Learning Generative AI Information Technology Small and Medium Businesses Cloud Storage Software GPU

🔧 Requirements

5+ years of experience in storage engineering, with a focus on high-performance environments.
Proficiency in storage protocols (NFS, S3) and technologies (RAID, ZFS).
Experience with storage hardware from major vendors (e.g. Weka, VAST, DDN) or open source tools (Lustre, MinIO, etc.).
Strong scripting skills (e.g., Python, Bash) for automation and monitoring.
Familiarity with data center operations and infrastructure management.

💡 Responsibilities

Design and deploy scalable storage architectures (SAN, NAS, object storage) tailored for GPU-intensive workloads.
Implement and manage backup, replication, and disaster recovery strategies.
Monitor storage performance and capacity, optimizing for efficiency and reliability.
Collaborate with compute and network teams to ensure seamless integration and performance.
Evaluate and integrate emerging storage technologies to maintain cutting-edge infrastructure.

🔥 Staff ML Infrastructure Engineer

Posted 5 days ago

💸 190,000 - 240,000 USD per year

🔍 Software Development

🏢 Company: Engine

🔧 Requirements

Hands-on with TensorFlow Serving, TorchServe, or similar frameworks.
Build production-grade APIs and integrate model inference into application workflows.
Containerize and orchestrate inference services at scale.

💡 Responsibilities

Deploy and operate machine learning models optimized for low-latency, high-throughput inference in production environments.
Build and maintain clean gRPC interfaces to expose model predictions to upstream services.
Own the production code paths that deliver features to the model—writing maintainable, testable application logic that integrates cleanly with the broader system.

🔥 Lead Infrastructure Engineer

Posted 5 days ago

🧭 Full-Time

🔍 Software Development

🏢 Company: Integration App

🔧 Requirements

You’ve built and run cloud infrastructure at scale (AWS preferred).
You work fluently with IaC tools (Terraform, CDK, etc.) and container platforms (Docker, Kubernetes).
You’ve implemented observability and understand distributed systems debugging.
You care about security, reliability, and helping others ship faster.

💡 Responsibilities

Own our cloud infrastructure, primarily AWS—design for scale, reliability, and security.
Improve observability—build out logging, monitoring, and tracing to catch issues before users do.
Streamline deployments—refine CI/CD pipelines, speed up builds, and improve dev workflows.
Make things reliable and efficient—automate failover, improve uptime, reduce cloud spend.
Level up developer experience—make development experience for our team smooth, fast, and safe.
Lead infrastructure work—set direction, share best practices, and mentor others as we grow.

🔥 Cloud Infrastructure Engineer

Posted 5 days ago

📍 Europe, South Africa

🧭 Full-Time

🔍 Air Cargo

🏢 Company: cargo.one

🔧 Requirements

At least 2 years of experience as a cloud infrastructure engineer with one of the major cloud providers.
A strong growth mindset.
Exceptional written and verbal communication skills, with fluency in English.

💡 Responsibilities

Operate our cloud infrastructure on GCP and Hetzner, including multiple Kubernetes clusters (GKE, Rancher).
Run and maintain infrastructure components hosted within Kubernetes, for example HashiCorp Vault, Redis and nginx-ingress.
Keep track of what our infrastructure is doing through Grafana dashboards and alerts.
Assess security risks and actively increase security of our operations by thinking about refined approaches to authorization, network segmentation or encryption (during transport and at rest).
Identify and implement improvements across our infrastructure stack, such as more infrastructure as code, better alerting, and reduced cloud cost.

Docker, Python, Cloud Computing, GCP, Kubernetes, Grafana, CI/CD, Linux, DevOps, Terraform, Ansible

Related Articles

How to Overcome Burnout While Working Remotely: Practical Strategies for Recovery

Posted about 1 month ago

Burnout is a silent epidemic among remote workers. The blurred lines between work and home life, coupled with the pressure to always be “on,” can leave even the most dedicated professionals feeling drained. But burnout doesn’t have to define your remote work experience. With the right strategies, you can recover, recharge, and prevent future episodes. Here’s how.

Top 10 Skills to Become a Successful Remote Worker by 2025

Posted 6 days ago

Remote work is here to stay, and by 2025, the competition for remote jobs will be tougher than ever. To stand out, you need more than just basic skills. Employers want people who can adapt, communicate well, and stay productive without constant supervision. Here’s a simple guide to the top 10 skills that will make you a top candidate for remote jobs in the near future.

Weekly Digest: Remote Jobs News and Trends (August 11 - August 18, 2024)

Posted 9 months ago

Google is gearing up to expand its remote job listings, promising more opportunities across various departments and regions. Find out how this move can benefit job seekers and impact the market.

Weekly Digest: Remote Jobs News and Trends (August 5 - August 11)

Posted 10 months ago

Read about the recent updates in remote work policies by major companies, the latest tools enhancing remote work productivity, and predictive statistics for remote work in 2024.

Tech Layoffs and the New Dynamics of Remote Jobs

Posted 10 months ago

In-depth analysis of the tech layoffs in 2024, covering the reasons behind the layoffs, comparisons to previous years, immediate impacts, statistics, and the influence on the remote job market. Discover how startups and large tech companies are adapting, and learn strategies for navigating the new dynamics of the remote job market.
