Senior AI Data Engineer
Based in the United States, Full-Time, Senior
Salary: $190K–$210K base range, depending on experience
Job Details
- Experience: 6+ years
- Required Skills: Docker, Python, SQL, GCP, Data engineering, BigQuery, dbt, Prompt Engineering
Requirements
- 6+ years of experience in data engineering with ownership of production-grade, mission-critical systems.
- Strong proficiency in Python with hands-on experience building and maintaining large-scale web scraping systems (Scrapy, Playwright, Selenium, BeautifulSoup).
- Proven experience designing and deploying LLM-powered or agentic systems in production environments.
- Strong understanding of prompt engineering, LLM evaluation, observability, and AI system performance trade-offs (latency, cost, quality, reliability).
- Experience building data models, transformation pipelines (e.g., dbt), and BI/reporting layers.
- Strong expertise in SQL and hands-on experience with the GCP ecosystem (BigQuery, Cloud Composer, Cloud Storage, Cloud Run/GKE).
- Familiarity with Docker and production system design for scalable data infrastructure.
- Strong reliability mindset with proven ownership of uptime, incident response, and production system stability.
- Understanding of legal and ethical considerations in large-scale web scraping and data acquisition.
- Experience working with AI-assisted development tools (e.g., Claude, Cursor) is highly desirable.
Responsibilities
- Own the end-to-end design, development, and reliability of large-scale data acquisition systems, including web scraping infrastructure and automated data pipelines.
- Build and maintain self-healing scraper systems that use LLMs and agentic workflows to detect, diagnose, and automatically recover from failures.
- Ensure daily data ingestion pipelines remain stable through monitoring, alerting, retry logic, and robust failure handling mechanisms.
- Develop AI-assisted parsing and entity extraction systems to handle unstructured or frequently changing web data.
- Own the data serving layer and ETL/ELT pipelines powering analytics and BigQuery-based data warehouses.
- Design and implement reporting systems, including data models, transformations, dashboards, and AI-driven narrative insights.
- Apply rule-based and ML/LLM-based techniques for data quality monitoring, anomaly detection, and system reliability.
- Build and maintain production-grade MCP servers and agentic workflows for internal and AI-driven data consumption.
- Collaborate with engineering, product, and leadership teams to define system architecture and ensure long-term maintainability.
- Document systems, best practices, and operational workflows to support scalable human-in-the-loop AI operations.