MLOps Engineer
Intetics
Technology
Ukraine, Armenia, Georgia, Moldova, Turkey
Full-Time
Middle
Salary not disclosed
Job Details
- Required Skills: AWS, Docker, Kubeflow, Kubernetes, MLflow, Grafana, Prometheus, Terraform, Databricks
Requirements
- Strong hands-on experience with AWS architecture, including security best practices, IAM, networking, and cost optimization
- Proficiency with Databricks: MLflow, Workflows, Feature Store, cluster management, Unity Catalog
- Experience with cloud-managed ML platforms such as AWS SageMaker or Google Vertex AI
- Expert knowledge of Terraform / Terragrunt for multi-cloud infrastructure provisioning and automation
- Deep expertise in Kubernetes, including autoscaling, GPU workloads, networking policies, and cluster optimization
- Practical experience with observability stacks such as Prometheus, Grafana, Loki, ELK
- Strong understanding of GitOps workflows and CI/CD tools (e.g., ArgoCD, FluxCD)
- Solid knowledge of Docker security, container hardening, and secure container orchestration
- Advanced experience in MLOps practices for continuous training (CT), CI/CD for ML models, and automated deployment
- Familiarity with ML pipeline orchestration tools such as Kubeflow or Argo Workflows
- Experience with LLMOps, including tools such as Langfuse, Ollama, and vLLM, and with supporting large-scale inference
- Ability to contribute to architecture design, set platform standards, and mentor MLOps or ML engineers
Responsibilities
- Design and implement scalable, secure, and cost-efficient MLOps solutions leveraging AWS and Databricks
- Automate ML deployment pipelines, reducing manual intervention and operational overhead
- Collaborate closely with data scientists to ensure solutions align with established MLOps architecture, best practices, and platform standards
- Integrate security controls and compliance requirements throughout the entire machine learning lifecycle
- Own and manage incidents end-to-end, from root cause analysis to prevention of future occurrences
- Contribute to software system architecture and the design of platform-level components
- Build and optimize ML training, retraining, and inference pipelines, ensuring reliability and scalability
- Enhance observability with metrics, logging, tracing, and dashboards to ensure system visibility and performance
- Drive best practices in infrastructure automation, CI/CD, and cloud resource management across ML teams