Senior Software Engineer - Cloud Platform Infrastructure
United States, Full-Time, Senior
Salary not disclosed
Job Details
- Experience: 5–7+ years
- Required Skills: AWS, Python, GCP, Kubernetes, Azure, Go, CI/CD, Distributed Systems
Requirements
- 5–7+ years of experience in platform engineering, SRE, or infrastructure-focused software engineering roles.
- Strong programming skills in Go and Python, or deep expertise in one with willingness to work across both.
- Hands-on experience operating Kubernetes in production environments.
- Strong understanding of distributed systems, cloud-native architectures, and infrastructure design principles.
- Experience with major cloud providers such as AWS, GCP, or Azure.
- Proficiency with CI/CD pipelines, infrastructure-as-code, and automation tooling.
- Experience participating in on-call rotations and managing production incidents effectively.
- Strong ownership mindset with the ability to work independently in complex technical environments.
- Excellent communication skills and ability to collaborate across distributed engineering teams.
Responsibilities
- Design, implement, and operate core cloud infrastructure components supporting a large-scale distributed platform.
- Build and maintain Kubernetes clusters, including the development of custom operators and platform automation tools.
- Develop production-grade services and automation in Go and Python to improve infrastructure reliability and efficiency.
- Improve scalability, performance, and cost efficiency across multi-cloud environments (AWS, GCP, Azure).
- Strengthen observability through metrics, logging, tracing, and monitoring systems to ensure platform reliability.
- Automate operational workflows and reduce manual intervention through engineering-driven solutions.
- Collaborate with platform, infrastructure, and product engineering teams to align system design and service integration.
- Participate in incident response, root cause analysis, and long-term reliability improvements.
- Drive continuous reduction of operational overhead (keep-the-lights-on work, or KTLO) through system improvements and automation.
- Contribute to architecture discussions and help evolve cloud-native platform standards and practices.