Sr. Software Engineer - AI Reliability

MixMode, Cybersecurity AI
Based in the United States, Full-Time, Senior
Salary: 150,000 - 210,000 USD per year

Job Details

Experience
7+ years of professional software engineering experience
Required Skills
Python, SQL, Java, Kotlin, Kubernetes, Machine Learning, Scala, Distributed Systems

Requirements

  • 7+ years of professional software engineering experience
  • Strong proficiency in Python
  • Strong proficiency in at least one JVM language (Java, Scala, Kotlin)
  • Proven experience designing, building, and operating distributed systems in production
  • Strong understanding of service architecture, concurrency, resource management, and distributed failure modes
  • Experience operating Kubernetes deployments
  • Strong experience with relational databases, including query performance analysis, indexing, and connection management
  • Demonstrated ability to diagnose and resolve performance, scalability, and reliability issues
  • Experience implementing automated testing and production observability (logging, metrics, tracing)
  • Experience collaborating with ML or data science teams
  • Ability to travel to our office in Santa Barbara, CA, a few times per year

Responsibilities

  • Own the reliability, performance, and operational health of production AI services
  • Refactor and harden existing systems to improve resilience, clarity, and maintainability
  • Diagnose and resolve issues across distributed services, data pipelines, and storage layers
  • Design and implement monitoring, alerting, and debugging tools for high-availability systems
  • Partner with researchers and engineers to productionize predictive systems at scale
  • Establish best practices for testing, deployment, capacity planning, and incident response
  • Contribute to incident response and postmortems, driving continuous improvement