How can I get all the news on a particular topic in one place? Right now it’s not possible. You have to go to five or ten different sites to get news on something. Say you want to know what’s hot today in startups. You might visit Hacker News, TechCrunch and The Next Web, and you do that daily. But say you are interested in JRuby or Ruby. Where would you go to get hot news on them? Yes, you can read the relevant subreddits or a mailing list that discusses them, but you may still miss Ruby news on other websites.<p>So what’s the problem here? We do not have a website that collects content from all these other websites and categorizes it. So how should such a web site work? Here is our attempt to solve this very problem. We call it the “News Problem”.<p>This web site should do the following:<p>It should be an aggregator of various web sites. It should allow users to follow one or more topics they are interested in. It should somehow know how to categorize content from other websites and then deliver it to the user.<p>That was the requirement. And guess what two programmers would do about it? Yeah, code their asses off to build a service like this.<p>So what have we been doing since last summer?<p>We (Vinod and I) like working with each other, and over time we have built a microblogging site, ScoopSpot. So we thought, let us use ScoopSpot as a platform to solve the news problem. So we started off.<p>Problem #1: We need a list of web sites to get news from.<p>We love to visit several web sites, like Hacker News, TechCrunch etc., so we already had a list. We also felt that, to keep our web site clean, there should be minimal editing required for the list of web sites to crawl.<p>Problem #2: How can we get the content?<p>This problem is already solved with RSS feeds. 
So we built an RSS feed reader that crawls the web sites periodically.<p>Problem #3: How do we define a topic?<p>We needed a way to define the concept of topics in our web site. We chose to define a topic as a tag, and we added the ability to follow a tag. We already supported following people, so it was a natural next move.<p>Problem #4: The big one. How the hell do we know which content belongs where?<p>This is where we have worked hard of late. We needed a system that can read an article and tag it automatically. We had to build a system that understands English, along with a battery of statistical methods for working out which tags apply to an article. We have had some success with this approach. We are not saying that we have solved it completely; work is still going on to reduce the false positives.
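<p>To make the RSS-plus-tagging idea concrete, here is a rough Python sketch. It is not ScoopSpot's actual code: plain keyword counting stands in for the language and statistical methods described above, and the names TAG_KEYWORDS, parse_rss, suggest_tags, tag_feed, the min_hits threshold and the sample feed are all made up for illustration.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical tag vocabulary: each tag maps to keywords that hint at it.
TAG_KEYWORDS = {
    "ruby": {"ruby", "jruby", "rails", "gem"},
    "startup": {"startup", "founder", "funding", "investor"},
    "music": {"music", "album", "band", "song"},
}


def parse_rss(xml_text):
    """Return (title, description) pairs for each <item> in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    return [
        (item.findtext("title", default=""),
         item.findtext("description", default=""))
        for item in root.iter("item")
    ]


def suggest_tags(text, min_hits=2):
    """Return every tag whose keywords occur at least min_hits times in text.

    Requiring more than one hit is a crude guard against false positives.
    """
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    tags = []
    for tag, keywords in TAG_KEYWORDS.items():
        hits = sum(words[keyword] for keyword in keywords)
        if hits >= min_hits:
            tags.append(tag)
    return sorted(tags)


def tag_feed(xml_text):
    """Map each item title in the feed to its suggested tags."""
    return {
        title: suggest_tags(title + " " + description)
        for title, description in parse_rss(xml_text)
    }


# Made-up sample feed, used only to exercise the functions above.
SAMPLE_FEED = """<rss version="2.0"><channel>
<item><title>New JRuby release</title>
<description>This JRuby release improves Ruby compatibility.</description></item>
<item><title>Band announces album</title>
<description>A founder of the band reviews the songs.</description></item>
</channel></rss>"""

if __name__ == "__main__":
    for title, tags in tag_feed(SAMPLE_FEED).items():
        print(title, "->", tags)
```

<p>In this sketch the second item mentions "founder" once, which is below min_hits, so it is tagged only as music and not as startup. A real tagger would need far more than keyword counts, but the threshold shows the kind of knob one might turn to trade missed tags against false positives.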
So we now have a system that gives us news on a lot of our favorite topics. Here are some links with tag names:<p>Ruby: http://www.scoopspot.com/ruby<p>Startup: http://www.scoopspot.com/startup<p>Music: http://www.scoopspot.com/music<p>If you would like to try out ScoopSpot, please log in with your Google/Facebook/Yahoo ID. We are looking forward to your feedback.<p>Thanks for reading.