NLP in Python vs other Programming Languages

32 points by rayvega, almost 15 years ago

14 comments

jules, almost 15 years ago
His Ruby<p><pre><code>for line in ARGF
  for word in line.split
    if word.match(/ing$/) then
      puts word
    end
  end
end
</code></pre> I'd write as<p><pre><code>for line in ARGF
  puts line.split.grep(/ing$/)
end
</code></pre> Or<p><pre><code>puts ARGF.map{|line| line.split.grep(/ing$/)}</code></pre>
lars512, almost 15 years ago
There are many dimensions along which we can evaluate programming languages, but the NLTK folk are only really interested in one: readability. Their page implicitly argues that high-level languages with good string processing are the most readable, and that amongst those Python is more readable than the alternatives (for both non-programmers and experts).<p>NLTK is supposed to be an educational toolkit. It's used by linguists taking their first steps in programming, and by CS students taking their first steps in the complexity and mess of human language. They're not looking for the shortest code, the fastest code, or the most &#60;quality attribute X&#62; code, just the most readable, insofar as readability can be supported and encouraged by a language.
tspiteri, almost 15 years ago
A C++ version:<p><pre><code>#include &#60;iostream&#62;
#include &#60;string&#62;

int main()
{
    std::string s;
    // operator&#62;&#62; reads one whitespace-delimited word at a time
    while (std::cin &#62;&#62; s) {
        // compare() returns 0 when the last three characters are "ing"
        if (s.size() &#62;= 3 &#38;&#38; s.compare(s.size()-3, 3, "ing") == 0) {
            std::cout &#60;&#60; s &#60;&#60; '\n';
        }
    }
}</code></pre>
emef, almost 15 years ago
The entire time I was reading, I hoped that a Haskell solution would be there (knowing it would be much simpler). I got my wish :) +1 to Haskell
Jun8, almost 15 years ago
I agree with everyone that Perl syntax can look like random gibberish; however, their particular Perl example seems quite easy to interpret.
kroger, almost 15 years ago
The Lisp code has a few problems. It's using a regex library that's only available in clisp (I think), it's not using the standard input like the other examples, and having two functions named has-suffix and has_suffix is no good. Also, it'll return an error if the string is shorter than the suffix.<p>In the following example I'm using the portable <a href="http://www.cliki.net/SPLIT-SEQUENCE" rel="nofollow">http://www.cliki.net/SPLIT-SEQUENCE</a> to split the words:<p><pre><code>(defun endswith (string suffix)
  (let ((size (- (length string) (length suffix))))
    (unless (minusp size)
      (equalp (subseq string size) suffix))))

(loop for line = (read-line *standard-input* nil)
      while line
      do (loop for word in (split-sequence #\Space line)
               do (if (endswith word "ing")
                      (write-line word))))
</code></pre> It's still wordier than Python, though.
ekiru, almost 15 years ago
The C and Prolog examples solve a different problem than the others. The others split on either any whitespace or on only spaces. The Prolog example splits on whitespace and punctuation. The C example splits on anything that isn't alphanumeric.
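To make the difference concrete, here is a minimal Python sketch of three splitting rules: any whitespace, single spaces only, and anything that isn't an ASCII letter or digit. The function names and the sample sentence are made up for illustration and do not come from the NLTK page; the point is only that the same input can yield three different word lists.

```python
import re


def ing_words_whitespace(text):
    """Split on any run of whitespace (str.split() with no argument)."""
    return [w for w in text.split() if w.endswith("ing")]


def ing_words_spaces_only(text):
    """Split on single space characters only; tabs and newlines stay inside words."""
    return [w for w in text.split(" ") if w.endswith("ing")]


def ing_words_alnum(text):
    """Treat every character that is not an ASCII letter or digit as a separator."""
    return [w for w in re.split(r"[^0-9A-Za-z]+", text) if w.endswith("ing")]


# Hypothetical sample input, chosen so that the three rules disagree.
sample = "Stop running, keep walking\tand singing.\n"

print(ing_words_whitespace(sample))   # ['walking']
print(ing_words_spaces_only(sample))  # []
print(ing_words_alnum(sample))        # ['running', 'walking', 'singing']
```

Punctuation glued to a word ("running,") hides the suffix from the whitespace-based versions, and a tab or trailing newline hides it from the spaces-only version.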
eru, almost 15 years ago
"LISP is a so-called functional programming language, in which all objects are lists, and all operations are performed by (nested) functions of the form (function arg1 arg2 ...). "<p>Reading this hurts.
tspiteri, almost 15 years ago
The C version has a buffer overflow if more than 1024 consecutive alphanumeric characters are input. And a much less serious point,<p><pre><code> isalnum(c) </code></pre> looks much better than<p><pre><code> (c &#62;= '0' &#38;&#38; c &#60;= '9') || (c &#62;= 'a' &#38;&#38; c &#60;= 'z') || (c &#62;= 'A' &#38;&#38; c &#60;= 'Z')</code></pre>
zephyrfalcon, almost 15 years ago
For what it's worth, this could be written in one line of Io:<p><pre><code> File standardInput readToEnd split select(endsWithSeq("ing")) foreach(println) </code></pre> (Given that I don't actually use Io a lot, there might be shorter ways to do this.)<p>Anyway, as usual, code samples prove very little. =)
10ren, almost 15 years ago
Lua (a better version is welcome):<p><pre><code>for line in io.lines() do
  -- gmatch is the Lua 5.1+ name for the older gfind
  for word in line:gmatch("[^%s]+") do
    if word:find("ing$") then
      print(word)
    end
  end
end</code></pre>
pgbovine, almost 15 years ago
i love python more than anything else in the world (well, almost), but i think that this example is quite superficial ... this line alone gives python its enormous 'readability edge' over other languages:<p><pre><code> if word.endswith('ing') </code></pre> of course, it's great standard library design to have a string method called endswith() rather than making people use a regexp ending in '$', since finding suffixes is a common operation. but such a simple operation is hardly indicative of hardcore NLP (which would mostly be hidden in special-purpose library code anyways)
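A rough sketch of the two ways to test for the suffix in Python, side by side. The helper names and the word list are invented for illustration. Besides readability, there is one small behavioral difference worth knowing: without extra flags, `$` in Python's `re` module also matches just before a trailing newline, while `str.endswith` looks at the literal end of the string.

```python
import re


def ends_with_ing_method(word):
    """Suffix test using the string method."""
    return word.endswith("ing")


def ends_with_ing_regex(word):
    """Suffix test using a regular expression anchored with '$'."""
    return re.search(r"ing$", word) is not None


# Hypothetical test words; "ring\n" carries a trailing newline on purpose.
words = ["sing", "ring\n", "ingot", "in"]

print([ends_with_ing_method(w) for w in words])  # [True, False, False, False]
print([ends_with_ing_regex(w) for w in words])   # [True, True, False, False]
```

So the two spellings agree on ordinary words and only diverge on input that still has its line ending attached.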
mark_l_watson, almost 15 years ago
I use Ruby, and not Python. That said, I still bought a print copy of this book a few years ago: nice book, and the NLTK package has a lot of great tools built in. Definitely "batteries included."
urza, almost 15 years ago
C#<p><pre><code>// Where() needs System.Linq; ForEach is a List&#60;T&#62; method, hence ToList()
Console.ReadLine().Split()
    .Where(word =&#62; word.EndsWith("ing"))
    .ToList()
    .ForEach(word =&#62; Console.WriteLine(word));</code></pre>