8 comments

jhanschoo, over 7 years ago

On the slim chance that someone here wants to re-implement the unmodified Kneser-Ney algorithm[0]: the book's presentation of it does not account for unknown tokens, that is, query tokens that are not in the vocabulary. I extended the recurrence to its natural closure, including unknown tokens, here [https://github.com/jhanschoo/HMMTagger/blob/master/readme.pdf]. It is a straightforward task, but deriving it and proving its correctness yourself might otherwise take you an hour or two (probably more), since I couldn't find such an extension in a Google search, nor is it described in the original paper. I believe it would likewise be straightforward to extend this to modified Kneser-Ney.

[0]: The modification due to Chen & Goodman, which uses multiple discount values, is regarded as better-behaved smoothing and is more popular today.
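For readers who want something concrete to experiment with, here is a minimal sketch of an interpolated Kneser-Ney bigram model with a single discount that still assigns probability to out-of-vocabulary query tokens. It is not the recurrence from the linked readme.pdf. It assumes one simple way to close the model: every unknown word is mapped to a single hypothetical `<unk>` slot, and the mass discounted from the continuation counts is spread uniformly over the vocabulary plus that slot. The names `train_kn_bigram`, `UNK`, and the default discount of 0.75 are illustrative choices, not anything from the book or the comment above.

```python
from collections import Counter, defaultdict

UNK = "<unk>"  # hypothetical slot shared by every out-of-vocabulary token


def train_kn_bigram(tokens, discount=0.75):
    """Return (prob, vocab); prob(word, context) is an interpolated
    Kneser-Ney bigram estimate using one absolute discount in (0, 1]."""
    if len(tokens) < 2:
        raise ValueError("need at least two tokens to form a bigram")

    bigrams = Counter(zip(tokens, tokens[1:]))
    context_total = Counter()      # how often u occurs as a left context
    followers = defaultdict(set)   # distinct words seen after u
    preceders = defaultdict(set)   # distinct words seen before w
    for (u, w), count in bigrams.items():
        context_total[u] += count
        followers[u].add(w)
        preceders[w].add(u)

    vocab = set(tokens) | {UNK}
    bigram_types = len(bigrams)    # number of distinct bigrams
    # Assumption: mass discounted at the lowest order is spread uniformly
    # over the vocabulary plus the <unk> slot.
    leftover = discount * len(preceders) / bigram_types

    def continuation(w):
        # Discounted share of bigram types ending in w, plus the uniform share.
        discounted = max(len(preceders.get(w, ())) - discount, 0.0)
        return discounted / bigram_types + leftover / len(vocab)

    def prob(word, context):
        w = word if word in vocab else UNK
        if context not in context_total:
            # Context never seen on the left of a bigram: use the lower order only.
            return continuation(w)
        total = context_total[context]
        discounted = max(bigrams[(context, w)] - discount, 0.0)
        backoff_weight = discount * len(followers[context]) / total
        return discounted / total + backoff_weight * continuation(w)

    return prob, vocab
```

With this closure, the probabilities over `vocab` sum to one for both seen and unseen contexts, and any unknown query word gets the `<unk>` share. Note that all unknown words share that one slot, which is a modelling shortcut rather than a derived result.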

imurray, over 7 years ago

Why link to the PDF?

The webpage https://web.stanford.edu/~jurafsky/slp3/ links directly to the PDF, and gives context and other download options. It's not so easy to go back from the PDF to the webpage.

Mods: I'd change the link, and the title to "...3rd Edition draft".

Everyone else: please stop linking to PDFs when there is an obvious HTML page to link to instead.

lgessler, over 7 years ago

Note that the 3rd edition isn't entirely finished yet. A full table of planned chapters is here: https://web.stanford.edu/~jurafsky/slp3

zerkten, over 7 years ago

I found speech and language processing to be one of the most interesting courses of my degree. Recently I decided to take a look at speech synthesis again and discovered a book by Paul Taylor on this subject (http://svr-www.eng.cam.ac.uk/~pat40/ and draft PDF at http://svr-www.eng.cam.ac.uk/~pat40/ttsbook_draft_2.pdf). It is more engineering-focused than other books in this area.

contingencies, over 7 years ago

Often this stuff is used for surveillance. If you choose to study this area, please be careful how you apply your knowledge. There are plenty of positive ways to use it: contributing content classification systems to sci-hub or libgen, building tools for the disabled, automating multilingual visual design aesthetics with computational linguistics and machine learning...

hackernewsacct, over 7 years ago

What are good resources for beginners wanting to learn about natural language processing? Are there any good books, tutorials, courses, etc.?

JabavuAdams, over 7 years ago

I need a voice activity detection (VAD) module for my wearable computer. Should I roll my own, or use someone else's (open-source)? My immediate need is speaker-dependent (just me), but it would be nice if I could offer up a speaker-independent version eventually.

akditer, over 7 years ago

Very good theoretical information.

Speech and Language Processing, 3rd ed. draft (2017)
