Senior AI Platform Engineer

Brazil | Full-Time | Senior
Salary not disclosed

Job Details

Required Skills
Docker, Python, Cloud Computing, Kubernetes, CI/CD, Terraform, Helm, MLOps

Requirements

  • Strong hands-on experience with cloud platforms (AWS, Azure, or GCP)
  • Solid experience with Kubernetes, containers (Docker), Helm, and cloud-managed services
  • Proven experience with CI/CD tools such as GitHub Actions, GitLab CI, Azure DevOps, or Jenkins
  • Strong knowledge of Infrastructure as Code tools such as Terraform, Pulumi, or CloudFormation
  • Experience implementing observability solutions (logging, monitoring, tracing, dashboards, alerting)
  • Solid understanding of networking, load balancing, authentication, authorization, and high availability architectures
  • Experience operating production systems, handling incidents, and improving system reliability
  • Ability to document technical standards, architecture decisions, and platform practices clearly
  • Experience with MLOps/LLMOps, GPU workloads, or AI platforms is a strong plus
  • Familiarity with tools such as vLLM, Triton, NVIDIA NIM, MLflow, Kubeflow, or Ray is a plus

Responsibilities

  • Design, build, and evolve cloud-native and Kubernetes-based environments for AI workloads, APIs, services, and data pipelines
  • Develop and maintain CI/CD pipelines with a strong focus on security, traceability, standardization, and deployment efficiency
  • Implement Infrastructure as Code practices to automate provisioning and management of cloud resources
  • Define and implement observability standards, including logging, metrics, tracing, alerting, and platform health monitoring
  • Support AI workloads such as inference services, embeddings, agent orchestration, and model integration components
  • Ensure robust security practices, including identity and access management, secrets handling, and environment segregation
  • Drive platform reliability through performance tuning, scalability improvements, resiliency, and cost optimization (FinOps)
  • Create reusable architectural patterns for deployment, monitoring, infrastructure, and operational workflows
  • Collaborate with AI, Data, Product, and Engineering teams to translate needs into scalable platform capabilities