It should be noted that this is not possible with regular expressions in the traditional sense (i.e. regular expressions only matching regular languages): <a href="http://math.stackexchange.com/a/181233/21405" rel="nofollow">http://math.stackexchange.com/a/181233/21405</a><p>Because of back references PCRE regular expressions can match non-regular languages as well.
tl;dr if you know a little regex<p><pre><code> - convert your number to a string of that length (1 = 1, 2 = 11, 3 = 111)
- handle 0 / 1 separately
- ^(..+?)\1+$
- trick; the WHOLE thing has to match to return (^$)
- first \1 match is "11", ergo string must be "11", "1111", "111111" to match
- second \1 match "111", ergo string must be "111", "111111", "111111111" to match
- and so on. if you find before length of string, it was not prime.
</code></pre>
Clever trick. Look forward to being asked it in your next google interview :)
Invented in 1998 by Perl hacker Abigail:<p><a href="http://neilk.net/blog/2000/06/01/abigails-regex-to-test-for-prime-numbers/" rel="nofollow">http://neilk.net/blog/2000/06/01/abigails-regex-to-test-for-...</a><p>(check out Abigail's other JAPHs if you like stuff like this)
This regexp is totally brilliant. It solves this problem in an unexpected way, because you think numbers, but the solution thinks strings. Not only that, the paradox, or its beauty, is that it goes to the root of the number abstraction: counting sticks.<p>I explored this technique some years later to solve a bunch of other problems, coprimality, prime factorization, Euler's phi, continued fractions, etc. (see <a href="https://github.com/fxn/math-with-regexps/blob/master/one-liners.sh" rel="nofollow">https://github.com/fxn/math-with-regexps/blob/master/one-lin...</a>).
> How would we go about that? Well, all we have to do is add ? in front of the +. This will lead us to the <.+?> regex.<p>I was very confused until I realized the author's definition of 'in front' wasn't the same as mine...
Is it just me or the author of this blog post has a very annoying way of writing? He keeps explaining things and then a couple of lines below he writes something like "remember the discussion above?" or "if you remember correctly ...". Hell, sure I remember, because I just read about this 10 seconds ago. Do people really have such a short attention span and can't remember what was in a previous paragraph? Repeating stuff over and over seems almost like content farming to me.
Very interesting. Thanks for writing this up.<p>This is a bit more Perlish (not that I'm an authority):<p><pre><code> sub is_prime {
(1x$_[0]) !~ /^(?:.?|(.{2,}?)\1+)$/;
}
</code></pre>
But perhaps less clear to a novice reader. You can also leave off the semicolon.
I suppose that this has been around for a while:<p><a href="https://news.ycombinator.com/item?id=9039537" rel="nofollow">https://news.ycombinator.com/item?id=9039537</a><p><a href="http://montreal.pm.org/tech/neil_kandalgaonkar.shtml" rel="nofollow">http://montreal.pm.org/tech/neil_kandalgaonkar.shtml</a>
Performing a Miller-Rabin primality test or filling a Sieve of Eratosthenes will almost always be the way to go. However, this line of code seems like magic and that's kind of impressive.
Why does the example given by the author with L=15 state that the regex matches once it reaches 5, instead of 3?<p>> So first, we’ll be testing the divisibility by 2, then by 3, then by 4 and then by 5, after which we would have a match.
It's basically the Sieve of Eratosthenes[1] algorithm, and it's possible to do it with regular expressions because numbers here are represented in unary[2] number system, where number of characters/tokens equals the number itself. It's a common trick for testing various Turing-machine stuff.<p>[1] <a href="https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes" rel="nofollow">https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes</a>
[2] <a href="https://en.wikipedia.org/wiki/Unary_numeral_system" rel="nofollow">https://en.wikipedia.org/wiki/Unary_numeral_system</a>
Can't quite get this to work in vim [1]:<p><pre><code> \(^1\{-,1}$\)\|\(^\(11\+\)\1\+$\)
</code></pre>
The two parts match what you'd expect on their own but the OR-ing screws it up: it means the whole regex matches <i>everything</i>.<p>Is this something vim gets right and every other engine wrong, the other way around, or...?<p>_____________<p>[1] I'm matching 1's rather than dots to avoid highlighting every bit of text ever anywhere all the time.<p>Also, that way it's so much more pleasing to the eye and easy to read, don't you think?