Research Crawling Engineer

Wynd Labs
AI Data Infrastructure
This is a fully remote team.
Full-Time
Salary: Competitive salary, benefits, and equity package.

Job Details

Required Skills
Python, Java, C++, Go, Rust, Distributed Systems

Requirements

  • Strong programming experience in one or more of: Go, Rust, Python, Java, or C++
  • Experience building web crawlers or large-scale data pipelines
  • Solid understanding of HTTP, networking, and browser behavior
  • Familiarity with distributed systems and parallel processing
  • Experience working with large datasets (TB–PB scale preferred)
  • Ability to debug systems in unstable or adversarial environments

Responsibilities

  • Build and maintain large-scale web crawlers across diverse domains
  • Design high-throughput, fault-tolerant systems for data collection (millions to billions of URLs/day)
  • Handle anti-bot systems, rate limits, and dynamic/JS-heavy sites
  • Develop pipelines for cleaning, deduplication, filtering, and normalization
  • Construct and maintain datasets for research and model training
  • Monitor crawl performance, coverage, and data quality; iterate quickly
  • Collaborate with research teams to align data collection with modeling needs
  • Optimize infrastructure for cost, latency, and reliability