Own reliability, availability, and performance of production systems running in cloud environments
Define and monitor SLIs/SLOs and help manage error budgets across the platform
Lead incident response efforts including detection, triage, mitigation, and postmortems
Improve observability through logging, monitoring, alerting, and dashboards
Automate operational workflows and reduce manual toil wherever possible
Partner closely with engineering teams to improve system resiliency and scalability
Assist with capacity planning, infrastructure optimization, and performance tuning
Build internal tooling, runbooks, and operational best practices
Support Kubernetes-based infrastructure and distributed systems at scale
Act as an escalation point for complex production and platform issues
AWS, Python, Bash, +7 more
About Orkes
Orkes empowers developers to build and scale reliable, distributed, event-driven applications, serving industries from fintech to healthcare. We provide a managed, cloud-hosted version of Conductor, the powerful open-source orchestration engine. You will create workflows across microservices, gain actionable insights, and iterate quickly with visual representations. Orkes handles mission-critical operations like order management and data protection, so your applications can process billions of workflows and you do not have to worry about failures or scalability.
How We Work
Orkes cultivates a remote-friendly culture with strong engineering autonomy. We operate as a fully distributed team and prioritize diversity to drive innovation and creativity. You will find an inclusive environment where every team member feels valued and can bring their authentic self to work. Our collaboration spans many locations and is built on trust and ownership.
Engineering at Orkes
Orkes is building the backbone for distributed applications on Conductor, the battle-tested open-source project that originated at Netflix. We solve complex infrastructure problems, making distributed systems reliable, observable, and scalable. Our platform powers billions of mission-critical workflows across diverse industries. You will work on real, difficult problems that developers face daily. We absorb system complexity so that teams can build with confidence. Your ideas will directly shape the product direction.
Why Join Us
Shape the future of cloud orchestration and AI workflows.
Work alongside a deeply technical and collaborative engineering team.
Solve complex distributed systems challenges at scale.
Take on high ownership and impact in a fast-growing company backed by $29.3M in funding.
Contribute to a remote-friendly culture with strong engineering autonomy.
Benefits & Perks
Comprehensive health coverage including medical, dental, and vision.