The absolute bare minimum every programmer should know about regular expressions

167 points by rayvega, nearly 15 years ago

19 comments

philwelch, nearly 15 years ago
Slight terminological quibble:

"Regular expressions are strings formatted using a special pattern notation that allow you to describe and parse text."

"Parse" is a questionable word choice. A proper regular expression can only describe patterns found in "regular languages", though modern backtracking regex engines have a bit more power. Parsing implies a parser, which means matching a grammar somewhat more sophisticated than a regex engine can handle.
lbrandy, nearly 15 years ago
Considering he listed just about everything I can remember about regexes without Google, and nothing more, I believe I'm duty bound to upvote.

My only quibble: I'd promote the character classes (especially \s and maybe \S) into this list from the "expert" category. Here's part 2 (or what I call: stuff I can google later, if I need it), if anyone wants to keep reading: http://immike.net/blog/2007/06/21/extreme-regex-foo-what-you-need-to-know-to-become-a-regular-expression-pro/
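
To make the \s and \S suggestion concrete, here is a minimal sketch in Ruby. The sample string and the variable names are made up for illustration and do not come from the article.

```ruby
# \s matches one whitespace character; \S matches one non-whitespace character.
line = "  alpha\tbeta   gamma\n"

# Trim the ends, then split on runs of whitespace.
fields = line.strip.split(/\s+/)

# Or collect the runs of non-whitespace directly.
words = line.scan(/\S+/)

p fields  # => ["alpha", "beta", "gamma"]
p words   # => ["alpha", "beta", "gamma"]
```

Both approaches give the same list here; `scan(/\S+/)` skips the separate trimming step.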
cousin_it, nearly 15 years ago
The title makes me wonder... I was an okay programmer before I learned about regexes. When the need arose, I learned how to use them easily. Should I have learned them before that?

Many programmers seem to think that they possess just the right amount of knowledge. Someone who knows Unix utilities but not algorithm design will argue that Unix utilities are vital, but algorithm design is a random obscure topic; someone with the opposite skillset will defend it just as eloquently. I think programmers have no obligation to know anything. If you can do the job, I don't care how much you use Google.

I'm spoiled enough to think that boring things shouldn't be deliberately memorized. It's enough for me to know the mathematical concept of a regular language, which lets me recognize situations where regexes would help. For concrete syntax we always have cheatsheets. This attitude has its advantages: for example, I never needed anyone to explain to me why parsing XML with regexes is a bad idea.
snitko, nearly 15 years ago
It's very disappointing to find the author didn't mention greedy matchers. Consider the difference:

    >> s = "hello world, I love you world"
    >> s.sub(/(.*?)world/, '\1universe')
    => "hello universe, I love you world"
    >> s.sub(/(.*)world/, '\1universe')
    => "hello world, I love you universe"
brehaut, nearly 15 years ago
I prefer fishbowl's law:

    Every regexp that you apply to a particular block of text reduces the applicability of regular expressions by an order of magnitude.

- http://fishbowl.pastiche.org/2003/08/18/beware_regular_expressions/

The minimum any programmer needs to know about regexps is when they are applicable and, more importantly, when they are not.
meric, nearly 15 years ago
It's quite a good guide but I had thought it was going to contain warnings like "Don't use regular expressions to parse XML" and then explain why.
joubert, nearly 15 years ago
In Python, you can get the regex parse tree to help you debug your regular expressions: http://stackoverflow.com/questions/101268/hidden-features-of-python/143636#143636
klochner, nearly 15 years ago
I learned something new: disjunction is sometimes called "alternation".

http://www.britannica.com/EBchecked/topic/165607/disjunction
pmiller2, nearly 15 years ago
One thing everyone should know about regular expressions is that the things called "regular expressions" provided by libpcre (which, of course, stands for "Perl compatible regular expressions") aren't regular expressions at all. Since they support backtracking, they provide a more powerful parsing framework than regular expressions alone (which don't support backtracking). That's how the (in)famous regular expression that recognizes prime numbers can work, even though the language of prime numbers is not regular.
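
The prime-number pattern mentioned above can be sketched in Ruby as follows. This is the commonly circulated form of the trick written from memory, not something quoted in this thread, and the names `not_prime` and `prime` are illustrative. In this pattern the feature doing the work is the backreference `\1`, which has no counterpart in formal regular expressions.

```ruby
# Unary trick: the number n is written as the string "1" * n.
# The first alternative catches 0 and 1. The second says "a group of two or
# more 1s, repeated at least twice", which is what being composite means.
not_prime = /\A1?\z|\A(11+?)\1+\z/

# n is prime when its unary form does NOT match the pattern.
prime = ->(n) { ("1" * n) !~ not_prime }

primes = (0..30).select { |n| prime.call(n) }
p primes  # => [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

The engine finds a divisor by backtracking over candidate lengths for the group, so this is a curiosity and far too slow for real primality testing.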
mbateman, nearly 15 years ago
For some reason, I've never been able to get the knack of regular expressions. This is the best introduction to them that I've seen, though I wish I'd read it years ago. However, if you've tinkered with regex at all ever, which I suspect virtually everyone here has, you'll know this. (I know it, and I hate and avoid regular expressions and am not even a programmer.)
SteveC, nearly 15 years ago
When writing regular expressions I always use this website to test them out.

http://www.rubular.com
raffi, nearly 15 years ago
I learned regular expressions when writing the documentation for my Sleep programming language. They never made sense to me until I had to organize my thoughts on them in a logical way. I'm still quite happy with how that chapter turned out:

http://sleep.dashnine.org/manual/regex.html
JoelSutherland, nearly 15 years ago
Here's a cool online tool my friend (HN user KrisJordan) made:

http://www.gethifi.com/tools/regex

It's super handy when learning regular expressions because it shows results as you type.
ez77, nearly 15 years ago
While we’re at it, could someone please explain why \[[^\[\]]*\] does not match things like [X], where X is a string of non-square-bracket characters? How would you fix it? Thanks!
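
As a quick check, here is the pattern from the question run unchanged in Ruby, where it behaves as intended. The sample strings are invented for illustration.

```ruby
# The pattern from the question, copied as-is.
bracketed = /\[[^\[\]]*\]/

samples = ["[X]", "see [foo bar] here", "[]", "no brackets"]

# String#[] with a regexp returns the matched text, or nil when nothing matches.
results = samples.map { |s| s[bracketed] }
p results  # => ["[X]", "[foo bar]", "[]", nil]
```

The question does not name the tool, so the following is only a guess: in POSIX bracket expressions (as used by grep and sed) a backslash inside `[...]` is an ordinary character, so the class ends at the first `]` and the pattern means something different. In those tools the class is usually written `[^][]`, which would make the whole pattern `\[[^][]*\]`.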
pmccool, nearly 15 years ago
I was surprised to find no mention of the issues with nesting. Every programmer should know about this limitation, even if they don't get into the pumping lemma and whatnot.
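
A small Ruby sketch of the nesting limitation, under the assumption that "nesting" means balanced delimiters such as parentheses. The names `flat` and `balanced` are illustrative. A pattern without recursion can only describe a fixed shape, while tracking arbitrary depth needs a counter, which a finite automaton does not have.

```ruby
# Matches exactly one level of parentheses and nothing deeper.
flat = /\A\([^()]*\)\z/

# A plain loop with a depth counter handles any nesting depth.
balanced = lambda do |text|
  depth = 0
  text.each_char do |ch|
    depth += 1 if ch == "("
    depth -= 1 if ch == ")"
    break if depth < 0          # a ")" arrived before its "("
  end
  depth.zero?
end

p flat.match?("(a)")            # => true
p flat.match?("((a))")          # => false
p balanced.call("((a)(b))")     # => true
p balanced.call("(()")          # => false
```

Adding more alternatives to `flat` can cover depth two or three, but every fixed pattern of this kind stops at some depth.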
lanstein, nearly 15 years ago
(2007)

Kind of disappointed that there are 13 comments and no mention of jwz's quote.

In terms of actual thoughts, this is certainly an aptly named article.
crpatino, nearly 15 years ago
1. Regular expressions cannot match recursive patterns.

2. No, they cannot! Extended regexps are a nice hack, but they do not really meet the formal definition of a regular expression.

3. man grep
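
For the "nice hack" in point 2, one example of such an extension is the subexpression call in Ruby's regex engine. The sketch below follows the form shown in Ruby's own Regexp documentation; the variable name `nested` is illustrative. It matches balanced parentheses, which no formal regular expression can do.

```ruby
# \g<paren> calls the named group again, so the pattern can nest to any depth.
# This is an engine extension and goes beyond formal regular expressions.
nested = /\A(?<paren>\(\g<paren>*\))*\z/

p nested.match?("()")       # => true
p nested.match?("(()())")   # => true
p nested.match?("(()")      # => false
```

Other engines spell this differently or lack it entirely, so a pattern like this is not portable.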
d0m, nearly 15 years ago
Did anyone learn something from this site? Why is it on Hacker News instead of newbie news?
dbz, nearly 15 years ago
Meh, no one here wants to hear my spoiled brat opinion, so I apologize in advance.

This should also be included in the bare minimum:

Regex will drive you crazy. Not just writing the expression, but installing a library (especially if there is no handy installer), etc. Regex likes to punch you in the balls whenever possible. (I can say this because I've written many lines of regex and I do love it.)

Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems. - Jamie Zawinski