Web Scraping Specialist - EU Remote CET

Posted 2024-10-15

💎 Seniority level: Junior, 2-4 years

📍 Location: Hungary, CET

🔍 Industry: Fraud prevention and risk detection

🗣️ Languages: English

⏳ Experience: 2-4 years

🪄 Skills: Python, Selenium

2-4 years of experience in web scraping, with a strong focus on data extraction from complex, dynamic websites and unstructured sources.
Proficient in Python and libraries such as Selenium, BeautifulSoup, Scrapy, or equivalent frameworks.
Experience working with third-party proxy providers and rotating proxies to handle scraping challenges.
Knowledge of client-faking techniques (e.g., user-agent manipulation, cookie management, header spoofing).
Familiarity with handling common web scraping challenges like CAPTCHAs, rate limiting, and bot detection.
Experience with API interaction and extracting data from both public and private APIs.
Strong problem-solving skills, attention to detail, and the ability to handle large-scale scraping projects.
Familiarity with data cleaning and processing best practices.
Fluent English.

Develop and maintain a scalable in-house scraping pipeline using Python.
Implement web scraping solutions using tools like Selenium, BeautifulSoup, or similar libraries.
Troubleshoot, optimize, and enhance existing scraping workflows and tools.
Work with data scientists and other colleagues to develop in-house data consolidation tools that clean and organize scraped data so that it is accurate, reliable, and ready for analysis.
Manage and use third-party proxy services to ensure effective data extraction and bypass anti-scraping mechanisms.
Apply advanced client-faking techniques (e.g., user-agent rotation, CAPTCHA solving, IP masking) to avoid detection.
Collaborate with data engineers and other team members to integrate data into pipelines or systems.
Stay updated on the latest developments in web scraping, proxies, and anti-scraping techniques.