Top JavaScript libraries for web scraping

1 point by fullofdev over 1 year ago

1 comment

gajus over 1 year ago
I am obviously biased, but Surgeon is by far the best abstraction for data extraction I've ever seen or written.

https://github.com/gajus/surgeon

For context, it was created to support my previous business, Applaudience. We had to build scrapers for (literally) thousands of cinema websites. At some point we migrated from custom scripts to Surgeon routines and reduced the overall codebase size by 70% in LoC. It was a huge time saver, both when writing new integrations and when debugging things that went wrong.

The reality is that data extraction is a highly specialized task, and you need specialized software to do it well. Tools like Surgeon can abstract away a ton of complexity, but they have a steeper learning curve.

I still use it whenever I need to scrape anything, and you can combine it with anything that outputs HTML, e.g. Playwright.

Ultimately the business died (covid was brutal) and I moved on to even more exciting things, but this remains one of those technologies that I wish had received more adoption.