Senior Infrastructure Engineer

Posted 2024-10-21

💎 Seniority level: Senior, 3+ years in a similar role.

📍 Location: Dubai, London

🔍 Industry: Data Infrastructure

🏒 Company: Eqvilent

⏳ Experience: 3+ years in a similar role.

🪄 Skills: AWS, Docker, Python, Apache Airflow, Apache Hadoop, Bash, Hadoop, Jenkins, Kafka, Kubernetes, Airflow, Apache Kafka, Grafana, Prometheus, Collaboration, CI/CD, Terraform

Requirements:
  • 3+ years in a similar role.
  • Proven experience with AWS or other cloud providers.
  • Experience with distributed systems (e.g. Apache Kafka, Apache Airflow, Apache Hadoop).
  • Proficiency with Terraform.
  • Extensive experience with Docker and Kubernetes, including cluster setup, node pools, and Helm charts.
  • Experience with CI/CD tools (e.g. GitLab CI, Jenkins).
  • Familiarity with observability tools such as Prometheus, Grafana, ELK stack.
  • Solid understanding of networking, security, and system architecture.
  • Strong scripting skills (e.g., Python, Bash).
  • Excellent problem-solving skills, communication, and collaboration abilities.
Responsibilities:
  • Design, implement, and maintain both cloud and on-premise compute and storage infrastructure.
  • Set up and manage Kubernetes clusters and implement Helm charts, ensuring high availability and performance.
  • Set up, maintain, and scale distributed systems (e.g. Apache Kafka, Apache Airflow), ensuring data integrity and security.
  • Automate code delivery processes and implement CI/CD, monitoring, logging, and alerting solutions.
  • Collaborate with development and operations teams, provide production support, and participate in on-call rotations.

Related Jobs

📍 United States, Canada

🔍 Digital sports media

  • Experience with infrastructure engineering.
  • Proficient in using Terraform.
  • Passion for performance, delivery, and metrics.

  • Collaborate closely with multiple engineering teams.
  • Help develop and improve internal tools and services.
  • Implement new infrastructure using Terraform.
  • Define scaling, alerting, and monitoring for systems.

DevOps, Terraform

Posted 2024-11-16

📍 Romania

🧭 Full-Time

🔍 Analytics engineering

🏒 Company: dbt Labs

  • Experience with AWS, Azure, or GCP, Terraform, Kubernetes, Python, and Bash.
  • Solid experience with declarative Infrastructure as Code, ideally Terraform, or a willingness to learn it.
  • Experience working asynchronously in a fully-remote, distributed team.
  • Excellent communication and writing skills.

  • Design, operate, and support infrastructure systems with parity across tenancy models (single vs multi) and public clouds (AWS and Azure).
  • Work with engineering teams to consistently deploy their services to those environments.
  • Help create a great developer experience collaborating with Architecture, SRE, Release Engineering, and Security teams.
  • Participate in a balanced on-call rotation and help upgrade tooling to reduce toil.

AWS, Python, Bash, Kubernetes, Azure, Terraform

Posted 2024-11-14

📍 Latin America

🧭 Full-Time

🔍 FinTech

🏒 Company: Flex

  • Proven experience in building, scaling, and monitoring cloud infrastructure on AWS, especially EKS, S3, RDS, and related services.
  • Proven experience using Terraform to update and maintain cloud infrastructure.
  • Proven experience with containerized applications, Kubernetes, and microservice deployments.
  • Strong knowledge of GitHub Actions and CI/CD best practices.
  • Experience with developer productivity tools and creating self-service solutions.
  • Knowledge of monitoring and observability tools; experience with Datadog is a plus.
  • Familiarity with networking concepts such as DNS, load balancing, firewalls, and VPNs.
  • Strong collaboration skills and ability to communicate technical ideas clearly.
  • Experience writing or reading code in Java, Python, or TypeScript.

  • Collaborate with service engineering teams to design, implement, and maintain scalable and resilient infrastructure solutions optimizing for performance, resilience, and cost.
  • Ensure infrastructure aligns with business requirements and industry standards.
  • Leverage Terraform to automate infrastructure provisioning and configurations.
  • Implement SRE principles to improve system reliability and reduce downtime.
  • Improve developer workflows by creating self-service tools, optimizing CI/CD pipelines, and enhancing deployment processes.
  • Develop and maintain robust monitoring and alerting systems to proactively identify and resolve issues.
  • Lead incident responses, manage on-call rotations, and facilitate post-incident reviews.
  • Automate everything: drive adoption of Infrastructure as Code (IaC) and build automated pipelines.

AWS, Python, Software Development, DynamoDB, GCP, Java, Kubernetes, TypeScript, Communication Skills, Collaboration, CI/CD, Terraform

Posted 2024-11-07

📍 United States

🔍 Software

🏒 Company: Hashgraph

  • 8+ years of experience monitoring, deploying, and patching Linux-based server infrastructure.
  • 5+ years of experience managing and maintaining bare metal Kubernetes clusters.
  • 5+ years of experience managing, maintaining, and deploying mission-critical network & server infrastructure.
  • 3+ years of experience with Zero-Trust Privileged Access Management solutions such as Teleport or StrongDM.
  • 3+ years of experience writing network infrastructure documentation including diagrams and policies.
  • 2+ years of experience with network infrastructure design and architecture.
  • 2+ years of experience with IPv4/IPv6 address space planning and management.
  • 2+ years of experience planning and deploying routing protocols (e.g. OSPF, BGP).
  • 2+ years of experience planning and deploying bare metal configuration management solutions.
  • 2+ years of experience writing effective design and process documentation.
  • Self-motivated with excellent communication, organizational, and leadership skills.
  • Experience with iterative and incremental engineering practices.
  • Cisco CCNP, CCDE, Juniper JNCIA-ENT, or JNCIA-DC certifications recommended.
  • Bachelor’s degree in Computer Science or equivalent work experience.

  • Managing, monitoring, and maintaining a fleet of 170+ bare metal servers.
  • Developing and maintaining Kubernetes clusters for CI/CD workflows and release management automation.
  • Ensuring the integrity and security of bare metal server deployments.
  • Developing and maintaining a scalable network and server infrastructure.
  • Maintaining network/server automation tools for the bare metal fleet.
  • Collaborating with IT and Security stakeholders for security auditing and threat mitigation.
  • Working with DevOps and Software Engineering stakeholders to align strategies and execution.
  • Ensuring product releases meet business goals.

Leadership, Cisco, Kubernetes, Strategy, Release Management, CI/CD, Linux, DevOps, Documentation

Posted 2024-11-07

📍 Georgia

🧭 Full-Time

🔍 Integration and automation software

🏒 Company: Workato

  • 7+ years of professional experience in hands-on engineering roles (DevOps/SRE), with a BS or MS in Computer Science (or equivalent).
  • 1+ year of experience hosting AI models (MLflow, AWS SageMaker, Azure AI, Kubernetes).
  • 1+ year of experience with MLOps (MLflow, vector databases, Dagster).
  • Strong experience managing Kubernetes clusters and workloads using EKS.
  • Proficiency in Python; knowledge of Go, Ruby, or JavaScript is a plus.
  • Experience with CI/CD tools like GitHub Actions or GitLab CI.
  • Expertise in deploying Kubernetes-based services using Kustomize, Helm, and GitOps tools.
  • Hands-on experience with AWS architectures, networking fundamentals, and web services.
  • Experience using Infrastructure as Code tools like Terraform.
  • Knowledge of container technologies and best practices.

  • As a Senior Infrastructure Engineer, you will be responsible for deploying, scaling, and maintaining services for the ML/AI team.
  • You will work closely with ML Engineers and Data Scientists as part of a flexible team.
  • Your role will have a direct impact on the modernization and maturation of the platform, including infrastructure architecture decisions.

DevOps

Posted 2024-11-07

📍 USA, UK, Germany, France, Canada, India, Chile

🧭 Full-Time

🔍 Automation

🏒 Company: Make

  • At least 5 years of experience in managing and operating Linux/Unix-based infrastructure.
  • Knowledge of at least one cloud provider, ideally AWS.
  • Day-to-day experience with a container orchestration platform, preferably Kubernetes.
  • Proficiency in Infrastructure as Code practices and tools such as Terraform.
  • Hands-on experience with CI/CD tools and various deployment strategies.
  • Understanding of Service Level Indicators, Objectives, and Agreements.
  • Effective communication skills in English.
  • Openness to knowledge sharing and mentoring.
  • Experience with troubleshooting and debugging issues.
  • Working knowledge of programming/scripting languages like Python or Go.

  • Design, build, and maintain a scalable & resilient infrastructure on AWS.
  • Follow the Infrastructure as Code principle to keep changes versioned.
  • Build and manage cloud infrastructure using Terraform.
  • Continuously evolve & maintain Kubernetes clusters.
  • Implement and consult on observability and monitoring frameworks.
  • Share knowledge in technologies like Kubernetes, Docker, and more.
  • Contribute to service blueprints.
  • Actively test and evolve system reliability.
  • Cooperate with Security on infrastructure compliance.
  • Design and support continuous deployment tooling.
  • Be on-call for incidents affecting availability.
  • Debug production issues across services.

AWS, Docker, Node.js, PostgreSQL, Python, Elasticsearch, Kubernetes, RabbitMQ, Go, Postgres, Redis, Communication Skills, CI/CD, Terraform

Posted 2024-11-07

📍 United States

🏒 Company: AssistRx

  • 7+ years of experience as a Linux Engineer or Systems Administrator in a production environment.
  • Proven experience with Linux systems, virtualization technologies, and cloud platforms.
  • Deep knowledge of Linux operating systems, especially RedHat and CentOS.
  • Strong knowledge of networking technologies and devices.
  • Proficiency in scripting tools like Bash, Python, and PowerShell.
  • Knowledge of configuration management and Infrastructure as Code practices.

  • Design and deploy Linux-based infrastructure to support business applications, ensuring scalability, performance, and security.
  • Manage and monitor Linux and Windows servers, ensuring optimal performance and uptime.
  • Implement and maintain security measures across the environment.
  • Collaborate with cross-functional teams to integrate Linux/Windows systems.
  • Provide advanced technical support for production issues.
  • Mentor junior team members and document infrastructure designs.

Leadership, Python, Bash, Cloud Computing, *Nix, Communication Skills, Analytical Skills, Collaboration

Posted 2024-10-25

📍 EMEA

🧭 Full-Time

🔍 Database and Software Solutions

🏒 Company: MongoDB

  • Pragmatic, detail-oriented, self-motivated, and understands the benefits of collaboration.
  • Provides guidance and coaching to entry- and mid-level engineers.
  • Takes a software-driven approach to solving problems and routinely uses git to track progress.
  • Familiar with software engineering principles, dependency injection, composition, and test driven development.
  • Experience designing/implementing medium/large scale software projects (preferably with Go).
  • Familiar with standard authentication protocols (e.g. OAuth).
  • Familiar with the development of web services and/or Kubernetes controllers.
  • Experienced performing deep technical analysis and fixing applications, systems, and networks.
  • Strong Linux and TCP/IP networking skills.
  • Solid knowledge of cloud infrastructure.
  • Experience with configuration management tools and managing infrastructure through code.
  • Familiar with how to use CI/CD workflows and tooling to deploy production services.
  • Experience running containers in a production environment, preferably Kubernetes based.
  • Experience with observability concepts and tooling, metrics, logging, traces, Prometheus, Grafana, OpenTelemetry.
  • Has practical knowledge of delivering production level services with SLI/SLOs and understands how to measure, track and adjust them.

  • Work with engineering teams across MongoDB to investigate gaps and limitations in existing development workflows and understand new infrastructure and platform requirements.
  • Design self-service platform services and developer tooling that focuses on reliability, usability, and provides the appropriate level of abstraction from cloud infrastructure.
  • Regularly write and review automation, configuration management, and application code.
  • Author and review functional specifications and scoping documents for large platform projects and services.
  • Own and operate much of the internal development platform that runs MongoDB.
  • Work on a distributed team that frequently interacts with remote engineers across multiple time zones.

AWS, Git, Kubernetes, Microsoft Azure, OAuth, Azure, Go, Grafana, Prometheus, Collaboration, CI/CD, Linux

Posted 2024-10-20

📍 Canada

🔍 AI-powered Fraud and Risk Platform

🏒 Company: DataVisor

  • BS in computer science or computer engineering required.
  • Solid background in computer systems, including operating systems, networking, databases, and distributed systems.
  • Past experience with hyper-scalable infrastructure is a strong plus.
  • 5+ years of industry experience.
  • 5+ years of programming experience in Java, Go, or Python.
  • 1+ years of experience with Kubernetes.
  • 3+ years of experience with big data infrastructure such as Spark, NoSQL databases, and real-time streaming pipelines.
  • Preferred: contribution to open source big data infrastructure.
  • Preferred: familiar with cloud platforms such as AWS, GCP, Azure, or Alibaba Cloud.
  • Past Site Reliability Engineering experience is also a plus.
  • A strong culture match with a fast-growing, agile startup environment: passionate, collaborative, fast-iterating, and hardworking.

  • We are looking for a senior-level infrastructure developer with a strong background and expertise in building and supporting hyper-scalable infrastructure to join the team.
  • DataVisor supports a wide variety of clouds including AWS, Azure, GCP, AliCloud, as well as some of the largest on-premise environments, all on top of Kubernetes with the latest big data and parallel computing technology.
  • The ideal candidate will be a domain expert in this area with a strong passion for combining advanced machine learning technology, including unsupervised machine learning, with hyper-scalable computation infrastructure to build a next-generation solution that supports billions of sophisticated real-time feature computations and unsupervised learning decisions.
  • To be successful in this role, you should have solid experience working with different technical teams in understanding different technologies and computation requirements, as well as interacting with business teams to understand product vision and value.

AWS, Python, Agile, GCP, Java, Kubernetes, Machine Learning, Azure, Go, NoSQL, Spark

Posted 2024-10-15

📍 United States

🧭 Full-Time

💸 83000 - 125000 USD per year

🔍 IT Infrastructure

🏒 Company: Sinch

  • Technical expert with excellent communication skills and a methodical work approach.
  • At least 5 years of professional experience in IT, specifically in IT infrastructure and engineering.
  • Experience with IT infrastructure, networks, applications engineering, Linux server, database & cloud engineering, Windows server, and IT security standards.
  • Relevant infrastructure certifications (e.g., CCNA, CCNP) and familiarity with Cisco Meraki, Cisco AnyConnect, and Infrastructure as code are a plus.

  • Proactively contribute to architecture strategy and principles with the Enterprise Architecture Team.
  • Administer and configure system and infrastructure technologies such as VPN, office network, Linux/Windows servers, and cloud infrastructure (AWS, Azure).
  • Consult with application owners and business stakeholders on infrastructure-related requirements and provide training to system users.
  • Provide 2nd and 3rd level support for system and infrastructure issues while participating in the on-call rotation.

AWS, Strategy, Azure, Communication Skills

Posted 2024-10-15