ML Engineer (AI/NLP, Vector Search)

Posted 2 months ago

💎 Seniority level: Middle, 5+ years

📍 Location: South Africa

🔍 Industry: Charity / Non-profit

🏢 Company: Kupa Global

⏳ Experience: 5+ years

🪄 Skills: AWS, Docker, Python, ETL, GCP, NumPy, PyTorch, Azure, Pandas, Spark, CI/CD, RESTful APIs

Requirements:
  • 5+ years in Data Engineering roles with a strong background in Python (Pandas, NumPy, PyTorch).
  • Proven track record working with large language models (e.g., Llama 2) and vector databases (e.g., ChromaDB).
  • Familiarity with containerization (Docker) and CI/CD pipelines (e.g., Jenkins, GitHub Actions).
  • Skilled in setting up AI/ML workflows in cloud environments (AWS, GCP, or Azure).
  • Experience with distributed computing frameworks (Spark, Dask) and additional vector search systems (Milvus, Pinecone) is a plus.
  • Comfortable integrating RESTful APIs, fine-tuning models, and optimizing performance at scale.
  • Strong analytical and troubleshooting abilities, plus effective communication skills for collaborating across multidisciplinary teams.

Responsibilities:
  • Design and implement vector-based search systems (e.g., ChromaDB) and optimize performance for large-scale datasets, supporting both real-time and batch queries.
  • Install, fine-tune, and deploy large language models such as Llama 2, and develop workflows for generating high-quality text summaries and embeddings.
  • Train and adapt LLMs using domain-specific datasets, continuously evaluating and improving model accuracy, scalability, and efficiency.
  • Develop and maintain robust ETL pipelines in Python, use Docker for containerization, and implement CI/CD pipelines to streamline integration and delivery.
  • Thoroughly document workflows, codebases, and best practices to ensure long-term maintainability and scalability.