Senior Platform Engineer - Portugal (Contractor)

Posted 6 days ago

💎 Seniority level: Senior, 6+ years

📍 Location: UK, Ireland, Israel, Estonia, Spain, Portugal, other East Europe locations

🔍 Industry: Software Development

🏢 Company: DoiT 👥 501-1000 💰 $100,000,000 Series A (over 5 years ago). Tags: Internet of Things, Big Data, Cloud Computing, Robotics, Analytics, Information Technology

🗣️ Languages: English

⏳ Experience: 6+ years

🪄 Skills: AWS, Bash, GCP, Kubernetes, TypeScript, Go, Grafana, Prometheus, CI/CD, RESTful APIs, Linux, DevOps, Terraform, Microservices, Networking, Scripting

Requirements:
  • 6+ years of proven experience in platform engineering, DevOps engineering, or related roles, with a strong track record of building and maintaining complex cloud infrastructure.
  • Strong hands-on experience with AWS/GCP, Kubernetes (EKS/GKE), and Terraform.
  • Demonstrated expertise in building and maintaining scalable, reliable, and secure cloud infrastructure, with a focus on automation and efficiency.
  • Strong coding skills in Go, TypeScript, or other relevant languages.
  • Proven experience with CI/CD tools, such as Argo CD, Atlantis, or similar technologies, and a deep understanding of CI/CD principles and best practices.
  • Understanding of networking concepts and protocols.
  • Extensive experience with monitoring and logging tools, such as Prometheus, Grafana, and the ELK stack, and a proven ability to use these tools to diagnose and resolve performance issues.
  • Knowledge of security best practices for cloud environments.
  • Excellent communication skills in English, both written and verbal.
  • Self-organized, goal-oriented, and self-motivated.
  • Ability to work effectively in a remote and distributed team environment.
  • Prior experience working specifically on platform engineering projects.

Responsibilities:
  • Function as an individual contributor within the team, actively collaborating with peers through thorough code reviews, providing constructive support and mentorship, and contributing to a unified technical direction for the platform. This role also requires collaboration with individuals in feature teams, supporting them and helping them adopt the platform features the team develops.
  • Architect, Design, and Implement Infrastructure as Code (IaC) using Terraform: You will be responsible for the comprehensive lifecycle management of our infrastructure through Terraform. This involves designing modular and reusable Terraform configurations, managing state effectively, implementing robust testing strategies, and ensuring that our infrastructure is consistently provisioned and managed in a predictable and repeatable manner.
  • Deploy, Manage, and Optimize Kubernetes Clusters on AWS (EKS) and GCP (GKE): You will take ownership of the deployment, configuration, and ongoing maintenance of our Kubernetes clusters on AWS Elastic Kubernetes Service (EKS) and GCP Google Kubernetes Engine (GKE). This includes managing node groups, configuring network policies, implementing service meshes, handling cluster upgrades, and ensuring high availability and fault tolerance. You will also be responsible for monitoring cluster health, performance, and resource utilization, and proactively addressing any issues that arise.
  • Develop and Maintain Sophisticated CI/CD Pipelines for Platform Components: You will design, implement, and maintain robust Continuous Integration/Continuous Deployment (CI/CD) pipelines specifically tailored for our platform components. This involves integrating various tools like Argo CD or Atlantis, automating build processes, implementing comprehensive testing strategies, and ensuring seamless deployment of platform updates. You will also focus on optimizing pipeline performance and reducing deployment times.
  • Diagnose, Troubleshoot, and Resolve Platform-Related Issues: You will be the primary point of contact for diagnosing and resolving platform-related issues, including performance bottlenecks, scalability challenges, and security vulnerabilities. This involves utilizing advanced troubleshooting techniques, analyzing logs and metrics, and collaborating with development teams to identify and resolve root causes. You will also contribute to creating comprehensive incident response plans and post-mortem analyses.
  • Drive Automation Initiatives to Streamline Operational Tasks and Enhance System Reliability: You will champion automation initiatives to eliminate manual operational tasks, reduce human error, and improve overall system reliability. This involves developing scripts, tools, and workflows to automate tasks such as infrastructure provisioning, configuration management, and monitoring. You will also proactively identify opportunities for automation and drive continuous improvement in our operational processes.
  • Act as a Strategic Partner to Development Teams, Understanding and Addressing Their Infrastructure Needs: You will foster strong relationships with development teams, acting as a trusted advisor and strategic partner. You will actively engage with them to understand their infrastructure requirements, provide expert guidance on platform capabilities, and ensure that our platform effectively supports their development workflows. You will also translate developer needs into actionable platform improvements.
  • Contribute to the Development of Internal Tools and Services to Enhance Platform Functionality: You will actively participate in the development of internal tools and services that enhance the functionality and usability of our platform. This involves designing, coding, testing, and deploying custom tools and services that address specific platform needs and improve developer productivity. You will also ensure that these tools are well-documented and easily accessible to other team members.
  • Implement and Enforce Rigorous Security Best Practices and Ensure Compliance with Industry Standards: You will be responsible for implementing and enforcing robust security best practices across our platform, including access control, vulnerability management, and data encryption. You will ensure compliance with relevant industry standards and regulations, such as SOC 2 and GDPR, and conduct regular security audits and penetration testing to identify and mitigate potential security risks.