Senior Software Engineer - Cloud Platform Infrastructure

United States | Full-Time | Senior
Salary not disclosed

Job Details

Experience
5–7+ years
Required Skills
AWS, Python, GCP, Kubernetes, Azure, Go, CI/CD, Distributed Systems

Requirements

  • 5–7+ years of experience in platform engineering, SRE, or infrastructure-focused software engineering roles.
  • Strong programming skills in Go and Python, or deep expertise in one with willingness to work across both.
  • Hands-on experience operating Kubernetes in production environments.
  • Strong understanding of distributed systems, cloud-native architectures, and infrastructure design principles.
  • Experience with major cloud providers such as AWS, GCP, or Azure.
  • Proficiency with CI/CD pipelines, infrastructure-as-code, and automation tooling.
  • Experience participating in on-call rotations and managing production incidents effectively.
  • Strong ownership mindset with the ability to work independently in complex technical environments.
  • Excellent communication skills and ability to collaborate across distributed engineering teams.

Responsibilities

  • Design, implement, and operate core cloud infrastructure components supporting a large-scale distributed platform.
  • Build and maintain Kubernetes clusters, including the development of custom operators and platform automation tools.
  • Develop production-grade services and automation in Go and Python to improve infrastructure reliability and efficiency.
  • Improve scalability, performance, and cost efficiency across multi-cloud environments (AWS, GCP, Azure).
  • Strengthen observability through metrics, logging, tracing, and monitoring systems to ensure platform reliability.
  • Automate operational workflows and reduce manual intervention through engineering-driven solutions.
  • Collaborate with platform, infrastructure, and product engineering teams to align system design and service integration.
  • Participate in incident response, root cause analysis, and long-term reliability improvements.
  • Continuously reduce operational overhead ("keep the lights on" work, KTLO) through system improvements and automation.
  • Contribute to architecture discussions and help evolve cloud-native platform standards and practices.