hafabnew, almost 13 years ago

From the docs:

'''
* Node.js [...]
* jQuery [...]
[...]
This approach has become my hammer when web scraping tasks come up.
'''

If all you have is a hammer, you may find yourself noticing that objects become more nail-like :).

wskinner, almost 13 years ago

I have also found node+jQuery an effective web crawling combination. In particular, the cheerio library (https://github.com/MatthewMueller/cheerio) greatly simplifies data extraction. And as others have mentioned, the asynchronous nature of node is perfectly suited to crawling (as long as you take care not to accidentally DDoS the target site).
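
One way to avoid hammering a target site is to cap how many requests are in flight at once. The following is a minimal sketch in plain Node.js, not taken from the article or from cheerio: `mapWithLimit` and `fakeFetch` are hypothetical names, and `fakeFetch` stands in for a real HTTP request.

```javascript
// Hypothetical helper: run worker(item) over items with at most `limit`
// calls in flight at any time. Results keep the order of `items`.
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next item to hand out

  // Each runner pulls the next unclaimed item until none remain.
  async function runner() {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }

  const runners = [];
  for (let k = 0; k < Math.min(limit, items.length); k++) {
    runners.push(runner());
  }
  await Promise.all(runners);
  return results;
}

// Stand-in for a real HTTP fetch (an assumption for illustration only);
// it resolves after a short delay instead of touching the network.
function fakeFetch(url) {
  return new Promise((resolve) =>
    setTimeout(() => resolve(`body of ${url}`), 10)
  );
}
```

A call such as `mapWithLimit(urls, 2, fakeFetch)` would process the list two URLs at a time. A real crawler would probably also want a delay between requests and some error handling, both of which this sketch leaves out.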

latchkey, almost 13 years ago

If you really want to scrape pages, you should use something like https://github.com/chriso/node.io/ which batches things in jobs, helps with error handling, I/O, etc.

blyxa, almost 13 years ago

Why not use the Twitter API?

danso, almost 13 years ago

Does Node have anything like Mechanize? Handling cookie state and the like is much more useful than jQuery's selector functionality, which is great but no better than what Nokogiri offers.
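
To make "cookie state" concrete, here is a rough sketch of the bookkeeping a Mechanize-style client does between requests. `SimpleCookieJar` is a hypothetical class written for illustration, not an existing Node module, and it deliberately ignores cookie attributes such as Path, Domain, and Expires.

```javascript
// Minimal cookie jar sketch: keeps name=value pairs from Set-Cookie headers
// and ignores attributes such as Path, Domain, Expires, and Secure.
class SimpleCookieJar {
  constructor() {
    this.cookies = new Map();
  }

  // Accepts one Set-Cookie header value or an array of them.
  store(setCookieHeaders) {
    const headers = [].concat(setCookieHeaders || []);
    for (const header of headers) {
      const pair = header.split(';')[0];
      const eq = pair.indexOf('=');
      if (eq < 1) continue; // skip malformed entries with no name
      this.cookies.set(pair.slice(0, eq).trim(), pair.slice(eq + 1).trim());
    }
  }

  // Builds the value for an outgoing Cookie request header.
  header() {
    return Array.from(this.cookies, ([name, value]) => `${name}=${value}`)
      .join('; ');
  }
}
```

With Node's built-in `http` module, the response's `headers['set-cookie']` array could be passed to `store()`, and `header()` supplied as the `Cookie` header on the next request. A production scraper would need proper domain, path, and expiry handling on top of this.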

Using Node.js and JQuery to Crawl Public Tweets

5 comments
