Lead Data Engineer
India | Full-Time | Lead
Salary not disclosed
Job Details
- Experience: 7+ years data engineering using Cloud services; 2+ years production AI/ML or LLM-era data infrastructure
- Required Skills: AWS, Python, Kafka, MLflow, Snowflake, LLM, PySpark
Requirements
- 7+ years of experience in data engineering using Cloud services.
- 2+ years of experience in production AI/ML or LLM-era data infrastructure.
- Proven experience building production-grade pipelines at scale.
- Deep expertise in Python, PySpark, Snowflake, Delta Lake, and Kafka.
- Hands-on experience with vector stores, embedding pipelines, and RAG environments.
- Working knowledge of MLOps tools such as MLflow and CI/CD for AI.
- Strong foundation in data governance, quality frameworks, and compliance-aligned engineering.
Responsibilities
- Build, test, and maintain production batch and real-time pipelines on Snowflake, PySpark, Delta Lake, and Kafka.
- Build end-to-end retrieval infrastructure including document ingestion, embedding pipelines, and vector store management.
- Maintain CI/CD pipelines for automated testing, deployment, and infrastructure-as-code.
- Implement and maintain knowledge infrastructure, including business entity mappings and knowledge graphs.
- Support LLM fine-tuning workflows and build ML data infrastructure for experiment tracking.
- Build and maintain data APIs, tool schemas, and state stores for autonomous agents.
- Implement governance, security, and data quality monitoring including RBAC and PII detection.