Site Reliability Engineer - Data Platform

Posted 6 days ago

💎 Seniority level: Senior, 5+ years

🔍 Industry: Software Development

🏢 Company: Kraken

👥 1001-5000

💰 Secondary Market about 1 year ago

🫂 Last layoff 4 months ago

Ethereum, Blockchain, Bitcoin, FinTech, Trading Platform

⏳ Experience: 5+ years

Requirements:
  • 5+ years working as a Site Reliability Engineer, Infrastructure Engineer, or in a similar role, with a focus on data infrastructure and security.
  • Experience with real-time data processing technologies, such as Kafka and Debezium.
  • Working experience managing hybrid systems, particularly AWS (HashiCorp is a nice-to-have).
  • Experience with Infrastructure as Code tools such as Terraform, Terragrunt, and Atlantis.
  • Experience with containerization and orchestration tools, particularly Kubernetes and Docker.
  • Solid understanding of bash/shell scripting and proficiency in at least one programming language (preferably Python or Rust).
  • Familiarity with CI/CD deployment pipelines and related tools.
  • Strong problem-solving skills and the ability to troubleshoot complex systems.
Responsibilities:
  • Design the data governance mechanisms that ensure our lakehouse is easy to interact with, secure and in compliance with all applicable regulations.
  • Implement the infrastructure we use to ingest our data, store it, catalog it with the right metadata and capture its lineage.
  • Provide a state-of-the-art suite of BI tools for multiple teams within the company.
  • Guarantee the availability, high performance, scalability and cost efficiency of our data platform.
  • Implement self-service data infrastructure solutions that support the needs of 10+ business units and over 100 engineers and data analysts.
  • Utilize Infrastructure as Code (IaC) principles to design, provision, and manage both on-premises and cloud (AWS) infrastructure components using tools such as Terraform.
  • Develop and maintain bash/shell scripts to automate operational tasks and deployments.
  • Enhance and manage CI/CD pipelines to facilitate consistent software deployments across the data infrastructure.
  • Implement robust data monitoring and alerting solutions to proactively detect anomalies and performance issues.
  • Implement and manage role-based access control (RBAC) and permissions for a multitude of user groups and machine workflows across different environments.
  • Manage and maintain real-time streaming data architecture using technologies like Kafka and Debezium Change Data Capture (CDC).
  • Ensure the timely and accurate processing of streaming data, enabling data analysts and engineers to gain insights from up-to-date information.
  • Utilize Kubernetes to manage containerized applications within the data infrastructure, ensuring efficient deployment, scaling, and orchestration.
  • Implement effective incident response procedures and participate in on-call rotations.
  • Collaborate with data analysts, engineers, and cross-functional teams to understand requirements and implement appropriate solutions.
  • Document architecture, processes, and best practices to enable knowledge sharing and support continuous improvement.
  • Support AI/ML teams with their infrastructure requests.