Wynd Labs

Private Company

Open Positions: 2

This is a fully remote team.

Full-Time

AI Data Infrastructure

Build and maintain large-scale web crawlers across diverse domains
Design high-throughput, fault-tolerant systems for data collection (millions to billions of URLs/day)
Handle anti-bot systems, rate limits, and dynamic/JS-heavy sites
Develop pipelines for cleaning, deduplication, filtering, and normalization
Construct and maintain datasets for research and model training
Monitor crawl performance, coverage, and data quality; iterate quickly
Collaborate with research teams to align data collection with modeling needs
Optimize infrastructure for cost, latency, and reliability

Python, Java, C++, +3 more

Showing 1 of 2 positions