Data Engineering Intern - Geospatial and AI Foundations
Inactive
Remote, Internship, Entry
Salary: 40 - 45 USD per hour
Job Details
- Required Skills
- Python, SQL, Airflow, dbt, LangChain
Requirements
- Currently pursuing a Master’s or PhD in Computer Science, Data Science, GIS, Geospatial Engineering, or a related field
- Available to work full-time for 3 months during the summer, then part-time through September
- Strong fundamentals in SQL
- Strong fundamentals in Python
- Familiarity with core geospatial concepts (CRS, spatial joins, indexing, spatial trees, optimization)
- Familiarity with AI agent architectures (e.g., ReAct) and protocols (A2A, MCP, AG-UI)
- Exposure to non-LLM geospatial deep learning approaches
- Experience with Airflow or similar orchestration tools
- Familiarity with GIS tooling such as PostGIS, GeoPandas, or QGIS
- Interest in AI-assisted developer tooling
- Ability to pair with senior engineers and participate in reviews and documentation
- Strong communication skills
Responsibilities
- Lead research-oriented analyses such as tree canopy classification, slope and terrain analysis, and spatial feature extraction at scale
- Design and document reproducible analytical workflows
- Translate complex geospatial methods into clear, accessible outputs for non-technical stakeholders
- Connect research outputs to production data pipelines
- Share learnings on emerging GeoAI methods with the team
- Build or improve Airflow ELT pipelines with mentorship and clear documentation
- Write clean, well-structured SQL and Python pipelines
- Develop modular dbt models with semantic layer definitions and documented business logic
- Contribute to data quality systems, including schema validation and freshness monitoring
- Support DataHub adoption through schema documentation and lineage tracking
- Communicate progress through documentation, code reviews, and updates
- Build data agents using tools like LangGraph, LangChain, or Bedrock Agent Core
- Develop and maintain RAG pipelines for natural-language data access
- Iterate on text-to-SQL approaches and document failure modes
- Contribute to MCP server development as needed
- Evaluate agent outputs and refine prompts based on feedback