How to understand half of Harry Potter book in any language (+ source code)

35 points by legierski about 13 years ago

13 comments

pooriaazimi about 13 years ago
> ... as I’ve already read all 7 books in 2 languages before - I could pick up a lot just from context...
That's what I've always said to my friends, but they insist on learning grammar, which of course they'll forget after a few hours/days. Three years ago I could hardly read anything non-technical in English; now I can understand Lolita, which is a hard novel for a non-native speaker to read. All thanks to audiobooks.
I found that audiobooks are the most fantastic way of learning a new language... I couldn't possibly have read Lolita or Silmarillion before. They're just too hard for someone who is trying to learn a new language. With long sentences full of new/invented words, it's easy to lose the thread. But listening to a skillful reader reading them aloud, you can understand even the most complicated words and sentences just from the context and the reader's tone and emphasis...
If you want to learn a new language, do yourself a favor and listen to some audiobooks. Pick a book you've already read in your native language (preferably more than once) and you'll be amazed how easy it is to understand and learn new words (you must have a basic understanding of that language, of course).
hkolek about 13 years ago
I like the approach but I think it's a big mistake to not strip stop words. He should focus on nouns and verbs imo. The top words he lists are all stopwords/grammatical particles "de", "que", "la", "y" etc. I don't think knowing those words will help to understand anything. I think if you understand only the grammatical particles in a sentence it won't help at all to understand the meaning of the sentence. On the other hand if you know the verbs and nouns but not the grammatical particles you can at least infer some meaning or what it's about.
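A minimal sketch of the difference this comment describes, counting the most frequent words with and without stop-word filtering. The stop-word set, the sample sentence, and the helper names `tokenize` and `top_words` are all hypothetical and chosen for illustration; this is not the code from the linked article.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny Spanish stop-word list (an assumption for
# illustration; real lists contain hundreds of entries).
STOP_WORDS = {"de", "que", "la", "y", "el", "en", "a", "los", "se", "no"}


def tokenize(text):
    """Lowercase the text and split it into runs of letters."""
    return re.findall(r"[^\W\d_]+", text.lower())


def top_words(text, n, stop_words=frozenset()):
    """Return the n most frequent words, skipping anything in stop_words."""
    counts = Counter(w for w in tokenize(text) if w not in stop_words)
    return counts.most_common(n)


# Made-up sample sentence, not taken from any book.
sample = "la varita de Harry y la lechuza de Harry y el mago que no se fue"
print(top_words(sample, 3))              # unfiltered ranking
print(top_words(sample, 3, STOP_WORDS))  # ranking after dropping stop words
```

On real text the unfiltered ranking is likely to be dominated by particles such as "de" and "la", while the filtered ranking surfaces nouns and verbs.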
acslater00 about 13 years ago
TL;DR Nearly half of the word occurrences in Harry Potter are prepositions, so if you learn a small number of them you can claim that you "understand half of Harry Potter". For example, you can absorb sparkling dialogue like the following:
"Harry and to at to I to with Voldemort or what to and I do for, Hermione!!"
mseebach about 13 years ago
So, the tl;dr is that this guy discovered that using a dictionary is a good way of learning words in a different language.
In more detail, there's an assertion and a proposed solution - and nothing to even remotely back up the assertion? Show me a page of Harry Potter in Spanish translated in this manner - I somewhat doubt it will make much sense.
pm215 about 13 years ago
There are some similar statistics for Japanese novels here: http://pomax.nihongoresources.com/index.php?entry=1223045359 which I think show that the problem is not at the "most common" end of the distribution but at the "least common" end. The jump between '80% understanding' and '90% understanding' requires knowing an extra 5341 words, 90% to 95% needs another 7495, and so on. Basically the long tail is really nasty, and even 90% understanding is still not knowing one word in ten...
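The long-tail effect described here can be sketched as a cumulative coverage count: how many of the most frequent distinct words are needed to cover a given share of all word occurrences. The function name `words_needed` and the toy corpus are hypothetical; the numbers it produces are not the Japanese statistics quoted in the comment.

```python
from collections import Counter


def words_needed(tokens, target):
    """Smallest number of most-frequent distinct words whose occurrences
    make up at least `target` (a fraction from 0 to 1) of all tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    covered = 0
    for rank, (_, freq) in enumerate(counts.most_common(), start=1):
        covered += freq
        if covered / total >= target:
            return rank
    return len(counts)


# Toy corpus (hypothetical): one very common word, two mid-frequency words,
# and a tail of twenty words that each appear once. 100 tokens in total.
tokens = ["the"] * 50 + ["wand"] * 20 + ["owl"] * 10 + [f"rare{i}" for i in range(20)]
for target in (0.5, 0.8, 0.9, 1.0):
    print(target, words_needed(tokens, target))
```

Even in this toy corpus, going from 80% to 90% coverage takes ten more words than the three needed to reach 80%, because every remaining word occurs only once.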
korussian about 13 years ago
That's a fantastic idea. I'm struggling to learn Korean, and it's tough because few of the words are recognizable to me. I have a base of English/French/Russian, so that doesn't help (much).
I would love to try to put my grammar/flash cards aside and go the Harry Potter route.
I can't code. What could you do to help me, an average user, do this with Korean?
krelian about 13 years ago
> did you know that out of 5 most popular languages in the world, 3 of them are relatively easy to acquire? They are: English, Spanish and Russian, and my plan is to be fluent in English and Spanish and be able to get by with Russian by the end of 2013! Who’s with me?!)
I'll grant that Spanish is relatively easy, but Russian is considered one of the most difficult languages to learn. It's hard for me to judge the difficulty of English, but I wouldn't say it is an easy language.
goblin89 about 13 years ago
I like the HP series in this regard. Vocabulary complexity increases slightly with each book, which helps you learn a language progressively.
nathell about 13 years ago
Reinventing Zipf's law, huh?
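For context on the reference: Zipf's law is the empirical observation that a word's frequency in a corpus is roughly inversely proportional to its frequency rank. In its usual form, with f(r) the frequency of the r-th most common word and s an exponent that is often close to 1 for natural-language text:

```latex
f(r) \propto \frac{1}{r^{s}}, \qquad s \approx 1
```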
rivalis about 13 years ago
NLP folks call those "stopwords," because they don't contribute much to statistical understanding of text. That is, in most NLP applications, those words are removed to leave more meaningful text behind. How did this make the front page?
wrs about 13 years ago
Linguality (http://www.linguality.com/) prints French and Italian novels with the original text on the right pages and a page-specific mini-dictionary on the left pages. No need to keep stopping to look up words in a dictionary.
Unfortunately there are only a few Linguality books. Can you do this with an e-reader?
tolliator about 13 years ago
As a native Russian speaker, I can say with utmost certainty that Russian is NOT the easiest language to learn. In fact, I would argue that it is somewhere toward the upper end of the difficulty scale.
I have been living in North America for only 10 years, and I can't even teach Russian to my own kids - we had to get a tutor.
raphman about 13 years ago
Nice. Additional stemming would probably provide better data, however.
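A rough sketch of why stemming would change the counts: inflected forms of one word get merged into a single entry before counting. The suffix list and `crude_stem` are hypothetical and far cruder than a real stemming algorithm such as Porter's; they only illustrate the merging effect on English examples.

```python
from collections import Counter

# Hypothetical suffix list, checked in this order; an assumption for
# illustration only, not a real stemming algorithm.
SUFFIXES = ("ing", "ed", "es", "s")


def crude_stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


tokens = ["wand", "wands", "look", "looked", "looking", "looks"]
print(Counter(tokens).most_common())                         # six separate entries
print(Counter(crude_stem(t) for t in tokens).most_common())  # two merged entries
```

Without stemming, each inflected form competes separately for a place in the frequency list; with it, a learner sees one entry per underlying word.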