- Own the architecture, development, and operation of scalable, secure, and fault-tolerant cloud services, with accountability for performance and reliability in production.
- Drive technical design and architectural decisions for distributed systems, influencing patterns, standards, and long-term platform evolution.
- Lead complex initiatives end-to-end, from design through deployment and ongoing optimization, ensuring alignment with business and technical priorities.
- Build and scale cloud infrastructure using infrastructure-as-code (Terraform, Helm) and container orchestration (Kubernetes), improving system resilience and efficiency.
- Advance cloud security and compliance practices, embedding secure design principles, IAM controls, and encryption into all layers of the platform.
- Improve system observability and operational excellence, implementing robust monitoring, alerting, and incident response strategies.
- Drive DevOps maturity, optimizing CI/CD pipelines and deployment strategies to support rapid, reliable delivery.
- Collaborate cross-functionally with engineering, product, and security teams to define solutions and resolve complex system-level challenges.
- Mentor and guide engineers, providing technical direction, code reviews, and support for skill development while raising the bar for engineering quality.
- Continuously evaluate and adopt new technologies, making pragmatic decisions that improve system performance, scalability, and developer productivity.
AWS, Python, Kubernetes, +4 more