- Design, develop, test, and maintain robust, scalable data pipelines using Python and large-scale data processing tools.
- Design and take ownership of key parts of ML systems to ensure reliability and scalability.
- Set up and manage MLOps practices, including CI/CD for model updates, monitoring, and rollout plans.
- Improve and manage data processing jobs on GCP (Dataproc, BigQuery, Cloud Run, Cloud Build).
- Collaborate with data scientists to productionize machine learning models.
- Create detailed documentation for system designs and code.
- Troubleshoot complex technical problems in distributed data systems and ML pipelines.
Docker, Python, SQL (+6 more)