TechEcho

Ask HN: Web scraping easy-medium-hard challenges?

9 points by isoos over 4 years ago
I'd like to help a friend learn more about web scraping (and also test automation, but that is less fun). Are you aware of any tutorial, competition, or anything in between that has tasks of varying difficulty?

E.g. easy: iterate over the article IDs and call curl on each one. Difficult: you need Puppeteer with multiple JS tricks to get through the first few pages, and the end is far away...
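
For reference, the "easy" tier could look roughly like the sketch below: loop over article IDs, build a URL for each, and fetch it the way `curl` would. The URL pattern, `http_get`, and `scrape_by_id` are made-up names for illustration, not any particular site's scheme.

```python
import time
import urllib.request

# Hypothetical URL pattern (assumption); replace with the real site's scheme.
URL_TEMPLATE = "https://example.com/articles/{id}"


def http_get(url, timeout=10.0):
    """Fetch a URL and return the body as text, roughly what `curl URL` does."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def scrape_by_id(ids, fetch=http_get, delay=0.0):
    """Fetch one page per article ID and return {id: body}.

    `fetch` is injectable so the loop can be tried without network access.
    `delay` is a pause in seconds between requests, to stay polite.
    """
    pages = {}
    for article_id in ids:
        url = URL_TEMPLATE.format(id=article_id)
        try:
            pages[article_id] = fetch(url)
        except OSError as exc:  # urllib's URLError and HTTPError subclass OSError
            print(f"skipping {url}: {exc}")
        time.sleep(delay)
    return pages
```

Something like `scrape_by_id(range(1, 101), delay=1.0)` would then walk IDs 1 through 100 with a one-second pause between requests.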

4 comments

tdeck over 4 years ago
Surprised nobody has mentioned ASP websites yet - definitely among the hardest. Those sites carry so much state in cookies rather than URLs, so you have to follow all the UI interactions in order to get to the result you're trying to parse. The markup is also typically really bloated and filled with randomly generated IDs.
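
A minimal sketch of what "following the UI interactions" can mean in practice: keep a cookie jar alive across requests, and replay the page's hidden form fields on every POST. The hidden-field part is an assumption on top of the comment above (ASP.NET WebForms pages typically carry fields such as `__VIEWSTATE`); all function names and the example URL are hypothetical.

```python
import http.cookiejar
import urllib.parse
import urllib.request
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collects name/value pairs from <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def hidden_fields(html):
    parser = HiddenFieldParser()
    parser.feed(html)
    return parser.fields


def make_session():
    """Opener that stores cookies and sends them back on later requests."""
    jar = http.cookiejar.CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def next_request(url, html, user_fields):
    """Build the POST that mimics submitting the form found in `html`."""
    form = hidden_fields(html)   # state the server expects to get back
    form.update(user_fields)     # values a user would have typed or clicked
    data = urllib.parse.urlencode(form).encode("ascii")
    return urllib.request.Request(url, data=data)  # data present, so POST
```

The loop is then: `session = make_session()`, fetch a page with `session.open(...)`, build the follow-up with `next_request(...)`, open that with the same session, and repeat for each step of the UI flow.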

ev1 over 4 years ago
Easy: find a random WordPress blog. Crawl by category, author, or page.

Medium: Scrape Yelp.

Hard: Scrape Yelp and exclude all the randomly generated garbage data, false phone numbers, and incorrect hours you get when they detect you're a bot and start feeding you bad data instead of blocking you.

Hard, expensive: Purchase a pair of limited edition sneakers requiring 3D Secure and 2FA.

quickthrower2 over 4 years ago
An easy challenge that is also very fruitful is “scraping” RSS feeds. A lot of good information is provided by RSS, and the challenge could be to aggregate and filter some RSS feeds, then generate a new one.
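
The aggregate, filter, and regenerate steps could be sketched like this with only the standard library. It assumes plain RSS 2.0 input (Atom feeds use different element names and are not handled), and the function names and the title-keyword filter are illustrative choices.

```python
import xml.etree.ElementTree as ET


def parse_items(rss_xml):
    """Return a list of {"title", "link"} dicts from an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    return [
        {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        }
        for item in root.iter("item")
    ]


def merge_and_filter(feeds, keyword):
    """Merge several feeds, keep items whose title contains `keyword`,
    and drop duplicates that share the same link."""
    seen = set()
    kept = []
    for rss_xml in feeds:
        for item in parse_items(rss_xml):
            if keyword.lower() in item["title"].lower() and item["link"] not in seen:
                seen.add(item["link"])
                kept.append(item)
    return kept


def build_feed(title, link, items):
    """Serialize the kept items as a new, minimal RSS 2.0 feed."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Filtered aggregate feed"
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["link"]
    return ET.tostring(rss, encoding="unicode")
```

Fetching the feed bodies is left out on purpose; any HTTP client works, since the functions only need the XML text.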

rdtwo over 4 years ago
Medium: scrape Craigslist, create a database with your results, and graph out prices.

Then link up reposts to track price history.

Use image recognition to find reused images.

Medium-hard: use web scraping to buy a PS5.
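
One way the "link up reposts" step might work, as a rough sketch: treat listings whose titles match after normalization as the same item, then sort each group by posting date. The record shape (`title`, `price`, `posted` as an ISO date string) and the matching rule are assumptions; real reposts often change the title, which is where the image matching would come in.

```python
import re
from collections import defaultdict


def normalize_title(title):
    """Lowercase and strip punctuation so minor edits to a title still match."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", title.lower()).split())


def price_history(listings):
    """Group listings by normalized title.

    Returns {normalized_title: [(posted, price), ...]} sorted by date.
    ISO dates (YYYY-MM-DD) sort correctly as plain strings.
    """
    history = defaultdict(list)
    for row in listings:
        key = normalize_title(row["title"])
        history[key].append((row["posted"], row["price"]))
    return {key: sorted(points) for key, points in history.items()}
```

The resulting per-item series can be fed straight into whatever plotting tool is used for the price graphs.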