- Design, implement, and maintain high-availability, high throughput PostgreSQL database systems.
- Define and improve service reliability through monitoring, alerting, and SLO-oriented metrics.
- Lead incident response, root cause analysis, and post-incident corrective actions.
- Partner with technical leaders to ensure system supportability and maintainability.
- Provide on-call coverage for production support.
- Ensure compliance with HIPAA security policies.
- Develop and improve database observability using tools like Datadog, Prometheus, or Grafana.
- Automate maintenance tasks and manage infrastructure as code using Ansible.
PostgreSQLPythonBash+6 more