Staff Software Engineer - Inference & Performance

Runware, AI Infrastructure
United Kingdom, Full-Time, Staff
Salary not disclosed

Job Details

Required Skills
PHP, Python, Go, Rust, Distributed Systems

Requirements

  • Software engineering expertise with a focus on backend and systems (PHP, Python, Go, Rust, or similar)
  • Proven experience building and operating high-performance, low-latency distributed systems in production
  • Deep understanding of asynchronous processing, queues, concurrency models, and back pressure
  • Strong intuition for performance trade-offs across CPU, GPU, networking, storage, and application layers
  • Experience making and defending critical architectural decisions in complex systems
  • Hands-on experience troubleshooting real production issues under load (latency, saturation, cascading failures)
  • Familiarity with modern cloud infrastructure, CI/CD, and observability stacks (metrics, tracing, profiling)
  • Ability to communicate clearly and influence across teams in a remote-first environment
  • Strong mentorship mindset

Responsibilities

  • Own end-to-end inference performance across the platform, with clear responsibility for latency, throughput, and reliability targets
  • Lead the architecture and design of core inference systems, including request routing, async execution, queuing, GPU scheduling, and result delivery
  • Drive the platform toward sub-1-second inference where feasible, identifying bottlenecks across networking, services, storage, and GPU execution
  • Make high-impact architectural decisions with performance, scalability, and operational simplicity as first-class concerns
  • Partner with ML and model teams to ensure models are production-ready from a performance perspective
  • Define performance budgets, SLAs, and success metrics
  • Lead deep-dive investigations into latency spikes and system-level performance issues
  • Influence and mentor engineers across teams on performance engineering and distributed systems
  • Improve tooling, observability, and profiling capabilities
  • Advocate for pragmatic engineering best practices around testing, benchmarking, and documentation