List of resources: Article text extraction from HTML documents

34 points by necrodome about 14 years ago

1 comment

juiceandjuice about 14 years ago
For a while now, I've aliased a version of wget as 'wcat' (alias wcat="wget -qO- -U NoSuchBrowser/1.0") to dump pages directly to my browser so I could quickly search through them and use less, sed, and all sorts of other stuff. Integrating text extraction into that would be pretty useful.
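
A rough sketch of what that integration might look like, assuming a small Python filter that reads HTML on stdin and prints only the visible text. The script name `extract_text.py`, the `TextExtractor` class, and the `-` argument are hypothetical names chosen for illustration. It uses only the standard library's `html.parser` and does naive tag stripping, not the article-detection heuristics the linked resources cover.

```python
import sys
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self.skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def extract_text(html):
    """Return the visible text of an HTML string, one text chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


# Hypothetical convention: pass "-" to read the page from stdin.
if __name__ == "__main__" and sys.argv[1:2] == ["-"]:
    print(extract_text(sys.stdin.read()))
```

With the alias from the comment, this could be chained as `wcat http://example.com | python extract_text.py - | less`.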