A regular expression to check for prime numbers

192 pointsby Moyamoover 10 years ago

18 comments

btillyover 10 years ago

This regular expression is due to Abigail of Perl fame. Who is still very active, see <a href="http://search.cpan.org/~abigail/" rel="nofollow">http://search.cpan.org/~abigail/</a> for more.However while cute, it is very slow. It tries every possible factorization as a pattern match. When it succeeds, on a string of length n that means that n times it tries to match a string of length n against a specific pattern. This is O(n^2). Try it on primes like 35509, 195341, 526049 and 1030793 and you can observe the slowdown. (Some regular expression engines may employ tricks that speed this up over the naive analysis, but none will be super speedy about it.)Usual arithmetic is much more efficient. :-)

评论 #9042019 未加载

评论 #9041874 未加载

评论 #9043195 未加载

anyfooover 10 years ago

For some this will be obvious, but for others it may still be interesting:If this was a "true" regular expression, in the sense that it describes a regular language (<a href="http://en.wikipedia.org/wiki/Regular_language" rel="nofollow">http://en.wikipedia.org/wiki/Regular_language</a>), then checking whether the number is prime or not, by checking if the expression matches, would run in linear time. This would be worth a few nobel prizes at least.Unfortunately, it is an "extended" regular expression, i.e. not regular at all in the strictest sense, and (as the article says) requires backtracking.

评论 #9041880 未加载

评论 #9041829 未加载

评论 #9042048 未加载

评论 #9041835 未加载

xyzzyzover 10 years ago

It has been posted before [2], so I'll repost my comment from back then [3]:It's cool, but it's regexp, not regular expression -- regular expressions cannot test for primeness. The proof is a simple application of pumping lemma[1]: we want to show that language { a^p: p prime } is not regular. Suppose it is. By pumping lemma we have numbers N, n >= 1, such that for every k >= 0, we have a^(N+nk) in our language. But that means that for every k, N+nk is prime, and it fails for k = N, since N+nN = N(1+n) is not a prime, since both N and 1+n are greater than 1.[1] - <a href="http://en.wikipedia.org/wiki/Pumping_lemma_for_regular_languages" rel="nofollow">http://en.wikipedia.org/wiki/Pumping_lemma_for_regular_langu...</a> [2] - <a href="https://news.ycombinator.com/item?id=3391547" rel="nofollow">https://news.ycombinator.com/item?id=3391547</a> [3] - <a href="https://news.ycombinator.com/item?id=3391572" rel="nofollow">https://news.ycombinator.com/item?id=3391572</a>

elzrover 10 years ago

This is indeed beautiful. But the author limits itself to a low-level unpacking of the regex and running a few tests cases, all the while remaining mystified as to why this works.A high-level description helps makes obvious what's happening:<pre><code> To check the primality of a number N with string matching, repeat a string (say, "1") N times and declare N prime: UNLESS the repetitions match the string 0 or 1 times OR the repetitions can be EVENLY matched into groups of MORE than one string.</code></pre>

评论 #9042131 未加载

andrewflnrover 10 years ago

Uses back-references, so it doesn't describe a regular language in the strict sense. Still, it's pretty cool to use them to trick a string-processing engine to do integer factoring.

评论 #9041837 未加载

评论 #9041717 未加载

tom-lordover 10 years ago

I'm currently working on a ruby gem to generate examples of given regular expression. For example:<pre><code> /this|is|awesome/.examples # => ["this", "is", "awesome"] </code></pre> So, I tried it out on this "prime number validator" regex:<pre><code> /1?|(11+?)\1+/.examples #=> ["", "1", "1111", "111111", "11111111"] /1?|(11+?)\1+/.examples(max_repeater_variance: 50, max_group_results: 100).uniq.sort.map(&:length).first(10) #=> [0, 1, 4, 6, 8, 9, 10, 12, 14, 15] require 'prime' Array(1..100) - /1?|(11+?)\1+/.examples(max_repeater_variance: 50, max_group_results: 3000).uniq.sort.map(&:length) == Prime.first(25) #=> true </code></pre> Horribly inefficient, but pretty cool! Here's my gem if case anyone's interested: <a href="https://github.com/tom-lord/regexp-examples" rel="nofollow">https://github.com/tom-lord/regexp-examples</a>

评论 #9047535 未加载

charlesackninover 10 years ago

Nice (abuse of regexp) !AFAICT the way the string is going to be parsed is equivalent to this:<pre><code> def isprime(n): if n == 0 or n == 1: return False i = 2 while i <= n: if (n - i) % i == 0: return False i += 1 return True </code></pre> A parser implementation may optimize and do "while (n - i) >= 1" instead, which cuts in half the number of iterations.

评论 #9043196 未加载

fxnover 10 years ago

I explored this technique applied to a few other problems just for fun, integer square root, coprimes, continued fractions, modulo, etc.<a href="https://github.com/fxn/math-with-regexps/blob/master/one-liners.sh" rel="nofollow">https://github.com/fxn/math-with-regexps/blob/master/one-lin...</a>

nsajkoover 10 years ago

It uses a string of '1' as a unary numeral representation of a number and then checks it for divisibility with 2,3,4... using brute force.

logicalleeover 10 years ago

less sexy when you write<pre><code> /^(11+?)\1+$/ is a prime test on whether a unary string > 2 is composite </code></pre> then people have a real chance of figuring it out. (go ahead and try, if you haven't read the article.)As it stands that 1 x shift is super confusing. who would have thought 1 would become a literal! we wouldn't even write it that way in english.

banticover 10 years ago

The thing that made this click for me was realizing the "1" in the regex was not special. The regex is just looping through integers (starting with 2) and checking if the target number is divisible by any of them. It can use any digit if the target string uses the same digit. Ruby example using 5 instead of 1:def is_prime(n); str = "5"*n; str !~ /^5?$|^(55+?)\1+$/; endThe regex essentially says, for integer n, "does 2 ones fit evenly into n ones", then "does 3 ones fit into n ones", etc., which is the same as "does 2 fit into n", "does 3 fit into n", etc.

avinashover 10 years ago

Thanks for sharing. This is a post I wrote way back in 2007 and, even though, as many have pointed out, the algorithm is very inefficient, it is still a good illustration of how powerful regular expressions are.When I am explaining the basics of regular expressions to people, I always refer them to this post to make them realize that regular expressions are not only glorified search terms. In general, they are amazed :-)

ekimekimover 10 years ago

An aspect of the syntax is confusing me here. I would have assumed that the expression 'x+?' would be parsed as '((x)+)?', ie. 'x*'. However, it seems the parser special-cases this as (x)(+?) instead, and interprets that as a non-greedy operator. Isn't this horribly inconsistent?

_lce0over 10 years ago

nice trick!here a JS implementation<pre><code> function isPrime(n) { var re = /^1?$|^(11+?)\1+$/; var s = []; while (n -- > 0) s.push(1); return !re.test(s.join('')); }</code></pre>

评论 #9041833 未加载

Amorymeltzerover 10 years ago

We all learned this way back with the Regex Game: <a href="http://regex.alf.nu/6" rel="nofollow">http://regex.alf.nu/6</a>

TheLoneWolflingover 10 years ago

Is there a way to write a (non-regular, of course) regular expression that checks for prime numbers in a non-unary base?

评论 #9043881 未加载

geertjover 10 years ago

This one has been going around for a while. I'd be curious to see a base-2 or base-10 variant of it.

okeyover 10 years ago

Evidently beauty is in the eye of the beholder.