How do news aggregators get a list of news sites?

10 points by kevinfat, over 11 years ago
If I wanted to make a news aggregator site, how do I get a curated list of news sites such as http://www.latimes.com/ and http://www.washingtonpost.com/?

It's not realistic for me to manually compile a list of them all by myself. How did the popular news aggregator sites build a comprehensive list?

5 comments

ScottWhigham, over 11 years ago
I think, logically, the question we should ask you is, "Why are you considering making a news aggregator site when you don't have the time to curate/determine which sources are good sources?" As a user, when I see a site that has "a list of news articles", I'm utterly underwhelmed and/or intimidated by the sheer number of articles. However, when I see a site that has a list of articles from good sources that I'd find interesting, I'll bookmark it.
ig1, over 11 years ago
Google News sources as of 2011:

http://img.labnol.org/files/Google-News.txt
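A plain-text source list like the one linked above usually needs de-duplicating before it is useful as a seed list. The following is a minimal sketch, assuming the file has one URL or bare domain per line; the actual layout of Google-News.txt is not described in this thread, and `unique_domains` is a hypothetical helper name.

```python
from urllib.parse import urlparse


def unique_domains(lines):
    """Return de-duplicated hostnames, in first-seen order, from lines of URLs.

    Assumption: each non-empty line holds one URL or bare domain.
    """
    seen = set()
    domains = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and '#' comment lines.
        if not line or line.startswith("#"):
            continue
        # urlparse only recognizes the host when a scheme or leading '//' is present.
        if "://" not in line:
            line = "//" + line
        host = urlparse(line).hostname  # already lowercased by urlparse
        if host is None:
            continue
        # Treat 'www.example.com' and 'example.com' as the same source.
        if host.startswith("www."):
            host = host[4:]
        if host not in seen:
            seen.add(host)
            domains.append(host)
    return domains
```

With a locally saved copy of such a file, `unique_domains(open("sources.txt", encoding="utf-8"))` would yield the seed list; the filename here is only an example.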
al1x, over 11 years ago
Why not scrape Alexa's list of the top 500? See http://www.alexa.com/topsites/category/Top/News

As a side note, not to ruin your party or anything, but over the years a handful of HN users have made news aggregators as side projects and none of them have really gone anywhere. You might want to think about putting your effort into something else. Google News is a pretty sweet product.
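Scraping a directory page such as the one suggested above mostly comes down to pulling the outbound links out of the HTML. The following is a rough sketch using only the Python standard library; it works on HTML text you have already saved, and it assumes the listed sites appear as ordinary absolute `<a href>` links, which the thread does not confirm for the Alexa page. `LinkCollector` and `external_hosts` are hypothetical names.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the hostnames of absolute links found in <a href="..."> tags."""

    def __init__(self):
        super().__init__()
        self.hosts = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Relative links such as '/about' have no hostname and are skipped.
        host = urlparse(href).hostname
        if host and host not in self.hosts:
            self.hosts.append(host)


def external_hosts(html_text):
    """Return the distinct link hostnames in html_text, in document order."""
    parser = LinkCollector()
    parser.feed(html_text)
    return parser.hosts
```

The result still includes the directory's own hostname if the page links to itself with absolute URLs, so a real run would filter that out before using the list.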
aviv, over 11 years ago
It's your lucky day. There are two data sets you can purchase for a decent price:

- 30M news headlines and 500K web sources, 30 GB of JSON data ($300)

- 15K news domains that are the most popular in the US market ($100)

These were gathered by Andrew Montalenti, co-founder of Parse.ly. See more info here: http://pixelmonkey.org/pub/python-crawling-slides/
steerpike, over 11 years ago
You might find something useful in this list of news-related APIs:

http://www.programmableweb.com/apis/directory/1?apicat=News