Principal AI/ML Platform Engineer
Mississauga | Full-Time | Principal
Salary: 169,000 - 188,000 CAD per year
Job Details
- Required Skills: Kubernetes, MLflow, CI/CD, LLM
Requirements
- Extensive experience building and maintaining AI platform infrastructure, including Kubernetes and container security.
- Demonstrated expertise in observability and monitoring frameworks, with a focus on real-time performance (e.g., experience with OpenTelemetry, MLflow).
- Experience with AI infrastructure components such as vector databases, prompt/versioning stores, and AI IDEs.
- Familiarity with vLLM, SGLang, or a similar framework for hosting LLM inference workloads.
- Experience with CI/CD pipelines and automation for AI model deployment and platform operations.
- Strong knowledge of authentication and authorization frameworks integrated into AI platforms.
Responsibilities
- Design, build, and maintain the core infrastructure layer supporting GenAI products, including model gateways, prompt/versioning stores, vector databases, and LLM evaluation tools.
- Implement secure access controls and authentication mechanisms integrated by default into the AI platform components.
- Develop and manage observability, monitoring, and logging solutions for GenAI workloads and infrastructure.
- Collaborate closely with product and engineering teams to integrate GenAI infrastructure with agent frameworks and downstream applications.
- Optimize infrastructure for scalability, high availability, and cost efficiency in production workloads.