Show HN: Regex Cheatsheet

499 points by geongeorgek over 5 years ago

36 comments

OK, these kinds of regex tools get posted quite often. I get it, regex is very confusing at first. And some of these use cases result in rather complex expressions nobody should be forced to write from scratch (you are still remembering to write unit tests for them though, right?)

But as someone who actually knows [some flavours of] regex fairly well, what I would really like is a reference that covers all the subtle differences between the various regex engines, along with community-managed documentation (perhaps wiki pages) of which applications & API versions use which flavour of regex.

For example, the other day I wanted to run a find on my NAS. I needed to use a regex, but the Busybox version of find doesn't support the `-iregex` option, so all expressions are case-sensitive. With some googling, I was able to find out that the default regex type is Emacs, but I wasn't able to find either a good reference for exactly what Emacs regex does and doesn't support, or any information about how to set the "i" flag. In the end I had to manually convert every character into a class (like [aA] for "a") which was tedious, but quicker than trying to find a better solution or resorting to grep.

A related, annoyingly common pattern is that the documentation for `find` states that `-regex` specifies a regex, but it does not state which flavour of regex. The documentation for certain versions of `find`, which support alternative engines, notes that the default is Emacs. From this I was able to infer (perhaps wrongly) that the Busybox `find` uses Emacs-flavoured regex, but ultimately I still had to resort to some trial and error. This problem is all too common in API documentation.
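
The manual workaround mentioned above (turning every letter into a class like [aA]) is easy to script. Here is a minimal Python sketch, assuming the pattern contains only literals, simple metacharacters and backslash escapes. The function name `expand_case` is made up for illustration, and it does not handle letters that already sit inside a bracket class.

```python
def expand_case(pattern: str) -> str:
    """Rewrite each ASCII letter as a two-character class, e.g. a -> [aA].

    Hypothetical helper: backslash escapes are copied through untouched,
    but existing bracket classes such as [a-z] are NOT handled.
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # Keep an escape sequence such as \. or \d exactly as written.
            out.append(pattern[i:i + 2])
            i += 2
            continue
        if ch.isascii() and ch.isalpha():
            out.append("[" + ch.lower() + ch.upper() + "]")
        else:
            out.append(ch)
        i += 1
    return "".join(out)


print(expand_case(r".*\.jpg"))
```

The output can then be pasted into a `find ... -regex` call on a system that has no case-insensitive option.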

crispyambulance over 5 years ago

I use regex a lot but deliberately keep it simple.

One thing that often confounded me was positive and negative look-arounds. I always got the expressions mixed up, until I just put the expressions into a table like this...

                look-behind | look-ahead
    ------------------------------------
    positive    (?<=a)b     | a(?=b)
    ------------------------------------
    negative    (?<!a)b     | a(?!b)

It's not hard, but for whatever reason my brain had trouble remembering the usage because every time I looked it up, each of those expressions was nested in a paragraph of explanation, and I could not see the simple intuitive pattern.

Putting it into a simple visualization helps a lot.

Now, if I can find a similar mnemonic for backreferences!?
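
For anyone who wants to poke at the four cells of that table, here is a small Python sketch using the standard `re` module. The sample string and the helper name `starts` are arbitrary choices for illustration.

```python
import re

# Sample text: "ab" at index 0, "cb" at index 3, "ac" at index 6.
text = "ab cb ac"


def starts(pattern: str) -> list:
    """Return the start index of every match of pattern in text."""
    return [m.start() for m in re.finditer(pattern, text)]


# Look-behind constrains what comes BEFORE the matched "b".
positive_behind = starts(r"(?<=a)b")   # "b" preceded by "a"
negative_behind = starts(r"(?<!a)b")   # "b" not preceded by "a"

# Look-ahead constrains what comes AFTER the matched "a".
positive_ahead = starts(r"a(?=b)")     # "a" followed by "b"
negative_ahead = starts(r"a(?!b)")     # "a" not followed by "b"

print(positive_behind, negative_behind, positive_ahead, negative_ahead)
```

In every case only the single character outside the parentheses is consumed; the look-around part just checks its neighbour.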

darau1 over 5 years ago

Nobody pointed it out, but there's also https://regexr.com/

It's how I learned regex years ago, and I still use it today to test/build more complex patterns.

__tk__ over 5 years ago

I'm loving the graphs, which for the first time in years are giving me an idea of what an expression is actually doing. That's because the visualization is kept in a form that is easy to understand with a programming background but can also be translated back to the expression itself in a straightforward manner.

geongeorgek over 5 years ago

I used to spend hours trying to craft the perfect expression for my scraping projects, not realizing that I didn't really know regex.

This tool is a cheat sheet that also explains the commonly used expressions so that you understand them.

- There is a visual representation of the regular expression (thanks to regexper)
- The application shows matching strings which you can play around with
- Expressions can be edited and these are instantly validated

StavrosK over 5 years ago

I love regex and have no trouble reading them, but still love this tool, great job. I especially like the railroad diagrams, for those cases where I brainfarted on a regex and it's doing something other than what I intended. Thanks for this.

lfglopes over 5 years ago

I used to use this site http://txt2re.com which is now off the grid, at least since yesterday. :(

Unlike most regex helpers, in this one you would start with the text you want to filter/parse and then it would suggest possible extractions.

Do you know any alternatives?

rubyn00bie over 5 years ago

Nice work on this!

Something subtle, but I quite loved that the email regex is, IMHO, close to perfect: \S+@\S+\.\S+

Because the "perfect" one is just absurd, and no one realizes it's going to be so fucking absurd until they start getting support cases and then go read something like this: https://stackoverflow.com/a/201378/931209

> If you want to get fancy and pedantic, implement a complete state engine. A regular expression can only act as a rudimentary filter. The problem with regular expressions is that telling someone that their perfectly valid e-mail address is invalid (a false positive) because your regular expression can't handle it is just rude and impolite from the user's perspective.
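
To see how permissive \S+@\S+\.\S+ is in practice, here is a quick Python sketch. The helper name and sample addresses are invented for illustration, and `fullmatch` is my choice so that the whole string has to fit the pattern.

```python
import re

# The loose pattern quoted above: non-space, "@", non-space, ".", non-space.
SIMPLE_EMAIL = re.compile(r"\S+@\S+\.\S+")


def looks_like_email(candidate: str) -> bool:
    """Rudimentary filter only; it does not prove the address exists."""
    return SIMPLE_EMAIL.fullmatch(candidate) is not None


samples = [
    "user+tag@example.co.uk",    # plus-addressing and multiple dots
    "user@localhost",            # no dot after the "@"
    "no-at-sign.example.com",    # no "@" at all
    "two words@example.com",     # contains whitespace
]
for s in samples:
    print(s, looks_like_email(s))
```

It rejects only the obviously malformed inputs, which fits the "rudimentary filter" point in the quoted answer.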

philshem over 5 years ago

I have a secret hobby of answering python + regex questions on stackoverflow with pure python.

vzidex over 5 years ago

Very cool! The site that worked best for me to learn regex was https://regexcrossword.com/ - after solving my way through all of them (I got really hooked when I discovered the site) I found I was alright at regex.

adambowles over 5 years ago

> /h.llo/ the '.' matches any one character other than a new line character... matches 'hello', 'hallo' but not 'h llo'

This claim in the cheatsheet is false (https://regexr.com/4tc48).

`.` can match any character except line breaks, and that includes whitespace.
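
A quick way to check this yourself, sketched in Python rather than the JavaScript flavour the site appears to target. In Python's `re`, `.` excludes only "\n" unless the DOTALL flag is set; the helper name is made up.

```python
import re

H_DOT_LLO = re.compile(r"h.llo")


def matches(candidate: str) -> bool:
    """True when the whole candidate fits h.llo."""
    return H_DOT_LLO.fullmatch(candidate) is not None


# A space and a tab are ordinary characters as far as "." is concerned;
# only the newline is refused.
for candidate in ["hello", "hallo", "h llo", "h\tllo", "h\nllo"]:
    print(repr(candidate), matches(candidate))
```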

dana321 over 5 years ago

One thing I've always missed from the Perl programming language is the regex operators.

You could do:

    my $var = 'foo foo bar and more bar foo!!!';
    if ($var =~ /(foo|bar)/g) { # does the variable contain foo or bar?
        print "foo! $1 removing foo..\n";
        # remove our value..
        $var =~ s/$1//g;
    }

asicsp over 5 years ago

Neat site! Clicking an example opens up a playground with live update and explanation and railroad diagrams, similar to sites like regex101 [1] and regulex [2].

One suggestion would be to mention clearly which tool/language is being used, since regex has no unified standard. Based on the "Cheatsheet adapted" message at the bottom, I think it is for JavaScript. I wrote a book on JS regexp last year, and I have a post with a cheatsheet too [3].

[1] https://regex101.com/

[2] https://jex.im/regulex

[3] https://learnbyexample.github.io/cheatsheet/javascript/javascript-regexp-cheatsheet/

Glench over 5 years ago

Plug for Verbal Expressions (no affiliation), which has an alternate way of compiling more human-readable regexes for a dozen languages: http://verbalexpressions.github.io/

mimixco over 5 years ago

This is awesome! Thank you! I hate regex, too, but I love your inline railroad diagramming tool.

superasn over 5 years ago

Regex are quite simple and useful but my only issue is with those recursive things. Like how do you match balanced brackets? I have a regex (pcre) copy-pasted for it but for the life of me I don't get it, or maybe I nod my head but instantly ununderstand it. I wish there was a simple-to-understand doc that teaches me how I can match something like:

    "(this is inside a bracket (and this is nested or (double nested)))

P.S. I know token parsing is better for these things but still I just want to learn the other thing too.
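
One way to read the usual PCRE recursion idiom, \((?:[^()]++|(?R))*\), is: an opening bracket, then any number of "a run of non-bracket characters OR the whole pattern again", then a closing bracket. Python's built-in `re` has no (?R), so the sketch below is a plain recursive function that mirrors the same three parts. It is an illustration of the idea (with a made-up function name), not a regex.

```python
def match_balanced(text: str, start: int = 0) -> int:
    """Return the index just past the balanced group that opens at
    text[start], or -1 if there is no balanced group there."""
    if start >= len(text) or text[start] != "(":
        return -1                      # the leading \( failed
    i = start + 1
    while i < len(text):
        ch = text[i]
        if ch == "(":
            # The (?R) branch: match a whole nested group recursively.
            i = match_balanced(text, i)
            if i == -1:
                return -1
        elif ch == ")":
            return i + 1               # the trailing \) matched
        else:
            i += 1                     # the [^()] branch
    return -1                          # ran out of text before closing


sample = "(this is inside a bracket (and this is nested or (double nested)))"
print(match_balanced(sample) == len(sample))
```

Each recursive call corresponds to the regex engine re-entering the pattern at (?R), which is why nesting to any depth works.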

xxsaculxx over 5 years ago

Nice tool! I personally use https://regex101.com/ as I like the explanations and quick reference.

sylvanaar over 5 years ago

Nothing will ever beat RegexBuddy when it comes to Regex tools. It is an entire IDE just for regex, and has been my not-so-secret weapon for a decade or more.

kitd over 5 years ago

This is really cool!

2 points:

1. it fiddled with my back button, which is a bit annoying
2. a better email sample is

    ^[^@]+@[^@]+\.[^@]+$

which removes the two-at-signs problem.
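
A short Python sketch of the difference between the two email patterns mentioned in this thread. The variable names and sample addresses are invented, and note that [^@]+ still lets whitespace through, so the stricter pattern is not a complete validator either.

```python
import re

# The loose pattern discussed elsewhere in the thread.
LOOSE = re.compile(r"\S+@\S+\.\S+")
# The pattern suggested above: exactly one "@" in the whole string.
STRICT = re.compile(r"^[^@]+@[^@]+\.[^@]+$")

double_at = "a@b@c.com"
normal = "user@example.com"
with_space = "first last@example.com"

# \S happily matches "@", so the loose pattern accepts two at signs.
loose_accepts_double = LOOSE.fullmatch(double_at) is not None
# [^@] cannot match "@", so the strict pattern rejects the same input.
strict_accepts_double = STRICT.match(double_at) is not None

print(loose_accepts_double, strict_accepts_double)
print(STRICT.match(normal) is not None)
# Caveat: spaces are not "@", so this one is accepted as well.
print(STRICT.match(with_space) is not None)
```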

dan_hawkins over 5 years ago

Is there a bug? In the regexp for IPv4 (https://ihateregex.io/expr/ip) the expression ends with {3} but the diagram states "2 times" in the lower right - shouldn't it say "3 times"?

KenanSulayman over 5 years ago

I don't understand why the GitHub repository lists regexper as the source of the visual graph code but the frame only shows iHateRegex as a watermark.

If the only thing that is embedded in that frame was taken entirely from a different project, that project should at least be mentioned in the frame.

hyperpape over 5 years ago

Really nice idea.

I found that you can see your own regex with a railroad diagram by going to one of the prepopulated examples and editing it. However, it wasn't clear to me that's the intended use of the tool. It's either a little side-effect, or not super-discoverable.

mNovak over 5 years ago

I always refer back to http://rexegg.com/

Not a tool as such, but a good reference if you know how it works and just need to refresh on syntax.

kazinator over 5 years ago

There is no way I would just plop that IPv6 regex into any serious program. :)

Diti over 5 years ago

For the love of god, PLEASE DON’T USE REGEX TO VALIDATE EMAIL. The RegEx of this website ignores plus-addressing, for example. All you need to do to validate email is send a verification email.

axegon over 5 years ago

This is awesome but.... I don't hate regex. Matter of fact, I love regex.

Amarok over 5 years ago

^[a-z0-9_-]{3,15}$

The username reference doesn't match 16 characters as claimed.
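
The {3,15} bound is easy to confirm with a few lengths. A minimal Python sketch, with a made-up helper name:

```python
import re

# The username pattern quoted above: 3 to 15 characters from a-z, 0-9, _ and -.
USERNAME = re.compile(r"^[a-z0-9_-]{3,15}$")


def is_valid_username(name: str) -> bool:
    """True when name fits the quoted username pattern."""
    return USERNAME.match(name) is not None


# Try the lengths on either side of both bounds.
for length in (2, 3, 15, 16):
    print(length, is_valid_username("a" * length))
```

To actually allow 16 characters the quantifier would need to be {3,16}.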

chenster over 5 years ago

For email-specific regular expressions, it's all covered on https://emailregex.com

binarysneaker over 5 years ago

These regexes are garbage. Others have suggested better sites for learning how to construct regexes, and Stack Overflow has plenty of great examples.

olalonde over 5 years ago

Thumbs up for the relatable domain name.

esaym over 5 years ago

Either I'm a regex wizard and don't know it, or perhaps I think I know something but know nothing at all, but I've never complained about using regular expressions. I use them all the time without thought. I never quite figured out the need for a cheatsheet either; your language of choice should have a good documentation page for any specific supported syntax.

hamid_ra over 5 years ago

Love the idea! I would crowdsource it so people can add their regexes and vote on other people's regexes!

ape4 over 5 years ago

The IPv6 regex is surprisingly complicated.

samat over 5 years ago

This is very neat, thank you!

blauditore over 5 years ago

Would be nice to have a regex for parsing HTML...

grabs popcorn

shawnyou over 5 years ago

Good tool