HNSummaries.com - algorithmically summarized HN articles to your inbox

82 points by dy about 13 years ago

17 comments

dy about 13 years ago
Would love to get people's feedback. I built this for myself over the weekend as a way to both speed up and limit my reading of HN (and because I was inspired by the NLP course from Stanford).

The NLP is pretty basic and takes a ratio of the original article, so you do get some longer listings.

Big thanks to Wayne Larsen of hckrnews.com for providing me with some insight on tracking top stories and letting me use his ranking data. Also, I recommend http://www.hackernewsletter.com/ for a human-curated version.
Dn_Ab about 13 years ago
Here is a simple recipe for doing something similar that works decently well as a starting point:

Count how many times each word appears in the document, using a dictionary or map structure. Also make sure you track the total number of words.

    document |> splitBySpace |> if dictionary has word then +1 else 1; totalwords++

Then split the document into sentences. Now, for each sentence:

    score = 0
    split the sentence by space, and for each word:
        score += -(dictionary[word]/totalwords) * log(dictionary[word]/totalwords)
    dictionaryScore.Add(sentence, score)

So now each sentence has a score. You can sort by best score and lose the original order. Or, if you want to filter by a limit between 0 and 1:

    find bestscore and keep each sentence where limit < score / bestscore

As I said, this is only a starting point, and it is susceptible to lists of random words (guess why). There are many ways to make it better. Here is a portion of code I dug up from a while ago:

```fsharp
// Helpers such as curryfst, splitstr, filterStop, mapAdd, mapGet,
// splitSentenceRegEx, flip and log2 come from utility code not shown here.

// Sum all the values stored in a map.
let inline sumMap m = m |> Map.fold (curryfst (+)) 0.

// Word counts for the document (after stop-word filtering) plus their total.
let inline internal countsAndSum n doc =
    let counts = splitstr [|" "|] doc |> filterStop n |> Array.fold mapAdd Map.empty
    counts, sumMap counts

// Entropy contribution of word k: -p * log2 p, or 0 for unseen words.
let ent m sum k =
    let p = (mapGet m k 0.) / sum
    if p = 0. then 0. else -p * log2 (p)

// Pair each sentence with the sum of its words' entropy contributions.
let eScore doc =
    let counts, sum = countsAndSum 0 doc
    splitSentenceRegEx doc
    |> Array.map (fun str ->
        str, splitstr [|" "|] str |> Array.fold (flip ((+) << (ent counts sum))) 0.)
```
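For readers who want to try the recipe above without the missing F# helpers, here is a minimal Python sketch of the same idea. It is an illustration, not Dn_Ab's code: the function names `score_sentences` and `summarize`, the regex-based sentence splitter, and the default `limit` are all assumptions, and there is no stop-word filtering.

```python
import math
import re
from collections import Counter


def score_sentences(document):
    """Return (sentence, score) pairs in document order.

    Each word contributes -p * log2(p), where p is the word's share of
    all whitespace-separated tokens in the document.
    """
    words = document.split()
    counts = Counter(words)
    total = len(words)

    # Naive splitter (an assumption): break after ., ! or ? plus whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", document.strip()) if s]

    scored = []
    for sentence in sentences:
        score = 0.0
        for word in sentence.split():
            p = counts[word] / total
            score += -p * math.log2(p)
        scored.append((sentence, score))
    return scored


def summarize(document, limit=0.5):
    """Keep sentences whose score exceeds `limit` times the best score."""
    scored = score_sentences(document)
    if not scored:
        return []
    best = max(score for _, score in scored)
    if best == 0:
        # Degenerate case: a single repeated token carries no entropy.
        return [sentence for sentence, _ in scored]
    return [sentence for sentence, score in scored if limit < score / best]
```

Because scores are summed per word, longer sentences tend to win, and a sentence made of many rare words scores highly regardless of meaning. That is one way to see the weakness to lists of random words mentioned in the recipe.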
marknutter about 13 years ago
I personally don't want algorithmically summarized content; I want content manually summarized by knowledgeable HN users. It's half the reason I click into the comments 99% of the time before clicking into the linked article. I want interesting insight along with a good summary of the main points being communicated. There's just no way automatically generated summaries can compete with that.
ankimal about 13 years ago
Just got my first newsletter. Looking good for an initial release. Some feedback:

Would love to get an index of headlines at the top of the email with anchors to the actual stories below.

Would love to see shorter summaries and maybe some of the top comments for each story (summarized, if possible).
petercooper about 13 years ago
Bear in mind that comments here are self-selecting for people who like HN's comments section ;-) But I know plenty of people, and speak to people on Twitter, who *deliberately* avoid these comments pages due to a perception (fair or not) of "drama" and whatnot. For those folks, an email like this could be just the ticket. For me though, I'm staying here ;-)
moconnor about 13 years ago
Thanks for sharing this, I'm curious to see how well it works out over time. It'd be nice to be able to choose the compression level.

Quality feels at least as good as an open source summarizer I played around with a while back; good work!
Timothee about 13 years ago
One thing I commend you for is asking when I want to receive the email. It's surprising that barely any mailing list or newsletter lets you pick that…
jilebedev about 13 years ago
Great execution, but I'm uncertain of the idea. My personal perspective: I read wikipedia for information -- I read HN for critical insight. Not always present, but a higher signal/noise ratio than other websites. I don't want a summary of information - I want critical thought.
eaurouge about 13 years ago
Why only 20 stories? I usually scan the first three pages once a day, a snapshot of the top 90 articles. Only about 10% are relevant, so I'd rather have more summaries to sift through to find the ~10 relevant articles for the day.
dreeves about 13 years ago
I actually thought algorithmic summaries would be worse than useless but they seem surprisingly good. Here's the one from Caine's Arcade:

"9 year old Caine sets up an arcade in his father’s used car parts store in East L.A., using only cardboard boxes his dad had lying around and a ton of ingenuity. Watch his dreams come true when this filmmaker sets up a flash mob to come and play. Just watching this may make you a better person. $82,000 has already been raised for Caine’s scholarship fund! little behind on the bandwagon, but...film just had me in tears."
DanielBMarkham about 13 years ago
I plan on adding this to my http://newspaper23.com site. It's just way on the back burner.

Ideally I think you would do it client-side, so readers could adjust the shrinkage to the amount of time they have to peruse. I was also thinking about a scenario where you could browse at, say, 100 words and then dive deep if you found anything that interests you. A more interactive approach. You might want to consider this.

But I really like the idea. Would love to hear how the project goes!
sabalaba about 13 years ago
I got my first email, here's some feedback.

You should make sure that the summaries don't scale linearly with the size of the content. Just because an article is 10x as long doesn't mean I want the summary to be 10x longer. Maybe scale logarithmically?

I didn't find any of the summaries to be high quality or any better than I could get from briefly skimming HN myself.

I've unsubscribed.
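One possible reading of the logarithmic-scaling suggestion above, as a small Python sketch. The function name `summary_length` and the `scale` constant are hypothetical choices for illustration; nothing here reflects how HNSummaries actually sizes its summaries.

```python
import math


def summary_length(article_words, scale=10):
    """Hypothetical target summary length, in words.

    Grows with log2 of the article length instead of a fixed ratio,
    and never exceeds the article itself.
    """
    if article_words <= 0:
        return 0
    target = round(scale * math.log2(1 + article_words))
    return min(target, article_words)
```

With this shape, a 5,000-word article gets a longer summary than a 500-word one, but well under twice as long, whereas a fixed ratio would make it ten times longer.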
chrishan about 13 years ago
I am taking an alternative approach to making sense of HN stories for Chinese readers. As a regular HN reader, I manually summarize the top stories and translate them into Chinese. The motivation is to lower the barriers to sharing startup/tech news. Link: http://geektell.com/
mistermann about 13 years ago
Really like it!

One small suggestion...could you make the "76 comments" under the title clickable through to the HN comments?

One other option (maybe a user preference): include some noteworthy excerpts from the HN comments in the email as well?
sabalaba about 13 years ago
Feature request:

It would be great to get a weekly or monthly summary.

Nice work.
gootik about 13 years ago
Why email? I'd like to see the summaries on a web page too.
SeoxyS about 13 years ago
How about giving writers the respect they deserve and not algorithmically rewriting their work? Has our attention span really gotten so short that we cannot read articles of substance any longer?