Could anyone help me build a Hacker News with tags?
I am only asking people who would want to use it themselves, because my budget covers hosting and nothing else.<p>The point is to be able to search the whole archive using tags/keywords.<p>Examples of tags:<p>'security'<p>'crm'<p>'a/b testing'<p>'optimization'<p>'http', 'ssl', 'domain name'<p>'scala', 'c++', 'php', etc<p>'lua'<p>'sql'<p>'marketing'<p>'website'<p>'landing page'<p>=> get all posts that relate to each tag (and to combinations of tags), <i>sorted by the points of the individual posts/comments</i>.<p>To-do list:
1. Import the entire Hacker News database.
2. Insert tags for all posts/comments into the database, using an algorithm similar to the one from the Kaggle keyword extraction competition (https://www.kaggle.com/c/facebook-recruiting-iii-keyword-extraction), which will need to be refined.
3. Create a great user interface for the new database.<p>-------
Or, if no one has the time, could anyone advise me on how to download the whole Hacker News database?
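A minimal sketch of the tag lookup described in the to-do list, assuming items have already been imported as dicts with `id`, `text`, and `points` fields (those field names, and the helpers `extract_tags`, `build_index`, and `search`, are hypothetical). Tagging here is plain whole-phrase matching against a fixed vocabulary, which is only a stand-in for a real keyword extraction algorithm.

```python
import re
from collections import defaultdict


def extract_tags(text, vocabulary):
    """Return the vocabulary tags that appear in text as whole words/phrases.

    Naive placeholder for keyword extraction: case-insensitive matching only.
    """
    lowered = text.lower()
    found = set()
    for tag in vocabulary:
        # Lookarounds stop 'lua' from matching inside 'evaluate'.
        pattern = r"(?<!\w)" + re.escape(tag.lower()) + r"(?!\w)"
        if re.search(pattern, lowered):
            found.add(tag)
    return found


def build_index(items, vocabulary):
    """Map each tag to the set of item ids whose text mentions it."""
    index = defaultdict(set)
    for item in items:
        for tag in extract_tags(item["text"], vocabulary):
            index[tag].add(item["id"])
    return index


def search(items, index, tags):
    """Return items matching ALL given tags, highest points first."""
    by_id = {item["id"]: item for item in items}
    matching = None
    for tag in tags:
        ids = index.get(tag, set())
        matching = ids if matching is None else matching & ids
    matching = matching or set()
    return sorted((by_id[i] for i in matching),
                  key=lambda item: item["points"], reverse=True)
```

Example usage: `search(items, build_index(items, ["security", "http"]), ["security", "http"])` returns only the items that mention both tags, ordered by points. For the full archive, the index would presumably live in a database table rather than in memory.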
1. You can download the dataset using <a href="http://hn.algolia.com/api" rel="nofollow">http://hn.algolia.com/api</a>. Mind the rate limits, though.<p>2. This has already been done quite a few times by various apps, most prominently here: <a href="http://algorithmia.com/demo/hn" rel="nofollow">http://algorithmia.com/demo/hn</a> (<a href="http://blog.algorithmia.com/post/86295023534/algorithmic-tagging-of-hackernews-or-any-other-site" rel="nofollow">http://blog.algorithmia.com/post/86295023534/algorithmic-tag...</a>)
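A rough sketch of how paging through that API for one time window could look, with a pause between requests to stay under the rate limits. The endpoint path `/api/v1/search_by_date`, the `tags`, `numericFilters`, `hitsPerPage`, and `page` parameters, and the `hits`/`nbPages` response fields are assumptions based on my reading of the public API docs, so check them before relying on this. The function names and the one-second delay are made up for illustration.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumed endpoint; verify against the API documentation.
API_URL = "https://hn.algolia.com/api/v1/search_by_date"


def build_url(start_ts, end_ts, page=0, hits_per_page=100, tags="story"):
    """Build a query URL for items created in [start_ts, end_ts) (Unix seconds)."""
    params = {
        "tags": tags,
        "numericFilters": f"created_at_i>={start_ts},created_at_i<{end_ts}",
        "hitsPerPage": hits_per_page,
        "page": page,
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def http_get_json(url):
    """Fetch a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def fetch_window(start_ts, end_ts, fetch_json=http_get_json, delay=1.0):
    """Yield every hit in one time window, one page at a time."""
    page = 0
    while True:
        data = fetch_json(build_url(start_ts, end_ts, page))
        yield from data.get("hits", [])
        page += 1
        if page >= data.get("nbPages", 0):
            break
        time.sleep(delay)  # be polite: spread requests out
```

To cover the whole archive you would call `fetch_window` repeatedly over consecutive, fairly small time windows (a day at a time, say) and store the hits as you go, since a single query is unlikely to return everything.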
Can't you just search for the keywords? I wonder how useful it would be, given that the information (tech articles such as rails2.1, best features in jQuery1.0, ...) will go out of date as time passes.<p>I think what stays useful is the various tools, as long as they are still alive. That's why I want to build a toolbox that collects all the useful tools.<p><a href="https://news.ycombinator.com/item?id=8413016" rel="nofollow">https://news.ycombinator.com/item?id=8413016</a>