TechEcho
Ask HN: Web-scraping – Do patterns/recipes exist for common scraping targets?

10 points by kisamoto over 4 years ago
I'm fairly familiar with web scraping/crawling, but I was wondering if there is a company/tool that has re-usable modules for scraping common websites?

Examples could include: scraping article texts from news websites; extracting recipes from Good Food etc.

Rather than rewriting what others have, is there an existing library of these scrapers/crawlers to use 'out of the box'?
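To make the idea of re-usable per-site modules concrete, here is a minimal sketch using only the Python standard library. Everything in it is an assumption for illustration: the `EXTRACTORS` registry, the `extractor` decorator, the `scrape` function, and the `news.example.com` / `recipes.example.com` hostnames are hypothetical and do not come from any existing tool. It shows one possible shape for such a library: one small extractor per site, looked up by hostname.

```python
from html.parser import HTMLParser
from typing import Callable, Dict, List
from urllib.parse import urlparse

# Hypothetical registry: hostname -> function that turns HTML into a dict.
EXTRACTORS: Dict[str, Callable[[str], dict]] = {}


def extractor(hostname: str):
    """Register the decorated function as the extractor for one hostname."""
    def register(func: Callable[[str], dict]) -> Callable[[str], dict]:
        EXTRACTORS[hostname] = func
        return func
    return register


class _TagText(HTMLParser):
    """Collect the text found inside every occurrence of a single tag."""

    def __init__(self, tag: str) -> None:
        super().__init__()
        self.tag = tag
        self.depth = 0
        self.chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            if self.depth == 0:
                self.chunks.append("")  # start a new text chunk
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks[-1] += data


def tag_texts(html: str, tag: str) -> List[str]:
    """Return whitespace-normalised text of each <tag> element, skipping empty ones."""
    parser = _TagText(tag)
    parser.feed(html)
    parser.close()
    return [" ".join(chunk.split()) for chunk in parser.chunks if chunk.strip()]


@extractor("news.example.com")  # hypothetical news site
def extract_article(html: str) -> dict:
    headings = tag_texts(html, "h1")
    return {"title": headings[0] if headings else None,
            "paragraphs": tag_texts(html, "p")}


@extractor("recipes.example.com")  # hypothetical recipe site
def extract_recipe(html: str) -> dict:
    headings = tag_texts(html, "h1")
    return {"title": headings[0] if headings else None,
            "ingredients": tag_texts(html, "li")}


def scrape(url: str, html: str) -> dict:
    """Dispatch already-fetched HTML to the extractor registered for the URL's host."""
    host = urlparse(url).hostname
    if host not in EXTRACTORS:
        raise LookupError(f"no extractor registered for {host!r}")
    return EXTRACTORS[host](html)
```

Fetching is deliberately left out: `scrape` takes HTML that has already been downloaded, so each site module stays a pure function that is easy to test and to share. Real pages would need far more robust selectors than "every `li` on the page".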

5 comments

abarrettwilsdon over 4 years ago
Not exactly what you're looking for, but there's an OSS Chrome extension that lets you record your actions in the browser and transcribes them into Nightmare.js code:

https://github.com/segmentio/daydream

Probably the best you're going to get: most things worth scraping are worth money, and as such are not freely available.
mendelevium over 4 years ago
For extracting news articles: https://newspaper.readthedocs.io/en/latest/
atmosx over 4 years ago
Try: https://www.scrapinghub.com/
thedevindevops over 4 years ago
What you're looking for sounds very open-ended, but the closest thing I can think of is the Huginn project on GitHub.
kilroy123 over 4 years ago
This is sorely needed IMO.