Sr. Software Engineer - AI Reliability
MixMode (Cybersecurity AI)
Based in the United States, Full-Time, Senior
Salary: 150,000 - 210,000 USD per year
Job Details
- Experience: 7+ years of professional software engineering experience
- Required Skills: Python, SQL, Java, Kotlin, Kubernetes, Machine Learning, Scala, Distributed Systems
Requirements
- 7+ years of professional software engineering experience
- Strong proficiency in Python
- Strong proficiency in at least one JVM language (Java, Scala, Kotlin)
- Proven experience designing, building, and operating distributed systems in production
- Strong understanding of service architecture, concurrency, resource management, and distributed failure modes
- Experience operating Kubernetes deployments
- Strong experience with relational databases, including query performance analysis, indexing, and connection management
- Demonstrated ability to diagnose and resolve performance, scalability, and reliability issues
- Experience implementing automated testing and production observability (logging, metrics, tracing)
- Experience collaborating with ML or data science teams
- Ability to travel to our office in Santa Barbara, CA, a few times per year
Responsibilities
- Own the reliability, performance, and operational health of production AI services
- Refactor and harden existing systems to improve resilience, clarity, and maintainability
- Diagnose and resolve issues across distributed services, data pipelines, and storage layers
- Design and implement monitoring, alerting, and debugging tools for high-availability systems
- Partner with researchers and engineers to productionize predictive systems at scale
- Establish best practices for testing, deployment, capacity planning, and incident response
- Contribute to incident response and postmortems, driving continuous improvement