Sr. Platform Engineer

United States, Full-Time, Senior
Salary not disclosed

Job Details

Required Skills
AWS, Kubernetes, CI/CD, Terraform, GitHub Actions, Datadog, CloudFormation

Requirements

  • Strong experience in infrastructure engineering, platform engineering, DevOps, or site reliability engineering roles
  • Hands-on expertise with AWS production environments, including infrastructure design and operational management
  • Advanced proficiency with Infrastructure as Code tools, particularly Terraform, with practical production-level usage
  • Solid experience managing Kubernetes clusters in production, including deployment, configuration, and ongoing maintenance
  • Demonstrated ability to design and operate CI/CD pipelines, especially using GitHub Actions
  • Experience implementing observability and monitoring solutions such as Datadog, including metrics, logging, and alerting frameworks
  • Strong understanding of containerization workflows, including image optimization and efficient build strategies
  • Ability to operate effectively in evolving environments where priorities shift and ambiguity is common
  • Strong collaboration and communication skills, with a pragmatic, iterative approach to problem-solving
  • Experience in startup or high-growth environments and exposure to platform engineering practices are highly valued

Responsibilities

  • Design, build, and maintain scalable, secure, and reliable cloud infrastructure in AWS, ensuring strong operational performance and automation across systems
  • Develop and manage Infrastructure as Code solutions using tools such as Terraform and CloudFormation to support repeatable and version-controlled deployments
  • Deploy, operate, and optimize Kubernetes clusters in production environments, ensuring high availability and efficient workload orchestration
  • Build and maintain CI/CD pipelines using tools such as GitHub Actions, with potential exposure to Jenkins or Argo CD for deployment automation
  • Implement and improve observability systems, including monitoring, logging, alerting, and incident response practices, using Datadog or similar tools
  • Support containerized application workflows, including image build pipelines, optimization, and deployment strategies
  • Collaborate with engineering teams to troubleshoot infrastructure issues, perform root-cause analysis, and drive long-term system improvements
  • Participate in architecture discussions, technical planning, and ongoing platform evolution initiatives to improve reliability and developer experience