Calculate the difference and intersection of any two regexes

353 points by posco, over 1 year ago

20 comments

JoelJacobson, over 1 year ago
I created a similar regex web demo that shows how a regex is parsed -> NFA -> DFA -> minimal DFA, and finally outputs LLVM IR/JavaScript/WebAssembly from the minimal DFA:

http://compiler.org/reason-re-nfa/src/index.html
oever, over 1 year ago
This library can be used to create string class hierarchies. That, in turn, can encourage wider use of typed strings.

For example, e-mail addresses and URLs each have a special syntax. Their value space is a subset of all non-empty strings, which is in turn a subset of all strings.

An e-mail address could be passed into a function that requires a non-empty string as input. When the type system knows that an e-mail string is a subclass of non-empty string, it knows that an e-mail address is a valid argument.

This library can be used to check the definitions and hierarchy of such string types. The implementation of the hierarchy differs per programming language (subclassing, trait bounds, etc.).
klysm, over 1 year ago
Regular expressions are a great example of bundling up some really neat and complex mathematical theory into a valuable interface. Linear algebra feels similar to me.
posco, over 1 year ago
This amazing page computes binary relations between pairs of regular expressions and shows a graphical representation of the DFA.

It’s a really incredible demonstration of some highly non-trivial operations on regular expressions.
est, over 1 year ago
Ha, I tried pasting the "regex filter numbers divisible by 3" patterns and the page froze to death: https://stackoverflow.com/q/10992279/41948

    ^(?:[0369]+|[147](?:[0369]*[147][0369]*[258])*(?:[0369]*[258]|[0369]*[147][0369]*[147])|[258](?:[0369]*[258][0369]*[147])*(?:[0369]*[147]|[0369]*[258][0369]*[258]))+$

    ^([0369]|[147][0369]*[258]|(([258]|[147][0369]*[147])([0369]|[258][0369]*[147])*([147]|[258][0369]*[258])))+$

I wonder if there's a shortest one.
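
Patterns like the two quoted above can be sanity-checked against plain arithmetic. The following is only an illustrative sketch, not part of the original comment: the names `DIV3_A`, `DIV3_B`, and `disagreements` are made up here, every `*` is treated as an ordinary quantifier, and the check covers only integers below a small bound.

```python
import re

# The two divisible-by-3 patterns from the comment, with every `*` read as a quantifier.
DIV3_A = re.compile(
    r"^(?:[0369]+"
    r"|[147](?:[0369]*[147][0369]*[258])*(?:[0369]*[258]|[0369]*[147][0369]*[147])"
    r"|[258](?:[0369]*[258][0369]*[147])*(?:[0369]*[147]|[0369]*[258][0369]*[258]))+$"
)
DIV3_B = re.compile(
    r"^([0369]|[147][0369]*[258]"
    r"|(([258]|[147][0369]*[147])([0369]|[258][0369]*[147])*([147]|[258][0369]*[258])))+$"
)


def disagreements(pattern, limit=3000):
    """Return every n < limit where the regex verdict differs from n % 3 == 0."""
    return [
        n for n in range(limit)
        if bool(pattern.match(str(n))) != (n % 3 == 0)
    ]


if __name__ == "__main__":
    print(disagreements(DIV3_A))
    print(disagreements(DIV3_B))
```

Both patterns track the digit sum modulo 3 with three implicit states, which is why a short pattern is hard to find: eliminating states from that automaton multiplies out the alternatives.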
layer8, over 1 year ago
I wanted to see the intersection between syntactically valid URLs and email addresses, but just entering the URL regex (cf. below) already takes too long to process for the page.

    [\-a-zA-Z0-9@:%._+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([\-a-zA-Z0-9()@:%_+.~#?&//=]*)

(source: https://stackoverflow.com/a/3809435/623763)
jepler, over 1 year ago
This is neat!

I was surprised, then not surprised, that the union and intersection REs it comes up with are not particularly concise. For example, the two expressions "y.+" and ".+z" have a very simple intersection: "y.*z" (equality verified by the page, assuming I haven't typo'd anything). But the tool gives

    yz([^z][^z]*z|z)*|y[^z](zz*[^z]|[^z])*zz*

instead. I think there are *reasons* it gives the answer it does, and giving a minimal (by RE length in characters or whatever) regular expression is probably a lot harder.
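
The equality claimed above can also be spot-checked without the page by enumerating short strings. This is a rough sketch under stated assumptions: the alphabet `yza`, the length bound, and the names `words` and `counterexamples` are choices made for illustration, and Python's `re.fullmatch` stands in for the tool's own matching.

```python
import re
from itertools import product

LEFT = re.compile(r"y.+")
RIGHT = re.compile(r".+z")
HAND_WRITTEN = re.compile(r"y.*z")
TOOL_OUTPUT = re.compile(r"yz([^z][^z]*z|z)*|y[^z](zz*[^z]|[^z])*zz*")


def words(alphabet, max_len):
    """Yield every string over `alphabet` with length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def counterexamples(alphabet="yza", max_len=6):
    """Strings on which the three descriptions of the intersection disagree."""
    bad = []
    for w in words(alphabet, max_len):
        in_both = bool(LEFT.fullmatch(w)) and bool(RIGHT.fullmatch(w))
        hand = bool(HAND_WRITTEN.fullmatch(w))
        tool = bool(TOOL_OUTPUT.fullmatch(w))
        if not (in_both == hand == tool):
            bad.append(w)
    return bad


if __name__ == "__main__":
    print(counterexamples())
```

An empty result only says that no disagreement exists among the enumerated strings; it is a bounded test, not a proof of equivalence.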
rsstack, over 1 year ago
I used this concept once to write the validation logic for an "IP RegEx filter" setting. The goal was to let users configure an IP filter using RegEx (no, marketing people don't get CIDRs, and they knew RegExes from Google Analytics). How could I define a valid RegEx for this? Its intersection with the RegEx of "all IPv4 addresses" is not empty, and not equal to the RegEx of "all IPv4 addresses". This prevented many complaints about the filter not doing anything, but of course didn't prevent wrong filters from being entered.
pimlottc, over 1 year ago
Suggestion: turn off auto suggest in the regex input fields to make it more usable on mobile.

https://stackoverflow.com/questions/35513968/disable-autocorrect-in-safari-text-input
x-complexity, over 1 year ago
I used 2 similar divide-by-3 regexes to test the page (after removing the ^ and $ from their ends), and it froze up:

Regex 1:

    ([0369]|([258]|[147][0369]*[147])([0369]|([147][0369]*[258]|[258][0369]*[147]))*([147]|[258][0369]*[258])|([147]|[258][0369]*[258])([0369]|([147][0369]*[258]|[258][0369]*[147]))*([258]|[147][0369]*[147]))*

Regex 2:

    ([0369]|[258][0369]*[147]|(([147]|[258][0369]*[258])([0369]|[147][0369]*[258])*([258]|[147][0369]*[147])))*

Everything up until the last '*' is parsable. The moment I put in the *, the entire page freezes up.

Without the *, it produced a valid verifier for parsing chunks of digits whose sum mod 3 = 0.
emmanueloga_, over 1 year ago
One possible application: if an input to a function parameter must match a certain regex, and the output of a function produces results matching another regex, we can know if the functions are compatible: if the intersection of the regular expressions is empty, then you cannot connect one function to the other.

Combined with the fact that regular expressions can be used not only on strings but more generally (e.g. for JSON schema validation [1]), this could be a possible implementation of static checks, similar to "design by contract".

--

1: https://www.balisage.net/Proceedings/vol23/html/Holstege01/BalisageVol23-Holstege01.html
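
The compatibility check described above comes down to asking whether the product of two automata can reach a state that is accepting in both. A minimal sketch, assuming the two regexes have already been turned into DFAs by hand; the `DFA` class, the example automata, and the function name are hypothetical and not taken from the linked tool:

```python
from collections import deque


class DFA:
    """A partial DFA: a missing transition means the input is rejected."""

    def __init__(self, start, accepting, transitions):
        self.start = start
        self.accepting = set(accepting)
        self.transitions = transitions  # {(state, symbol): next_state}


def intersection_is_empty(a, b, alphabet):
    """Breadth-first search over the product automaton of `a` and `b`."""
    start = (a.start, b.start)
    seen = {start}
    queue = deque([start])
    while queue:
        p, q = queue.popleft()
        if p in a.accepting and q in b.accepting:
            return False  # some string is accepted by both automata
        for symbol in alphabet:
            next_a = a.transitions.get((p, symbol))
            next_b = b.transitions.get((q, symbol))
            if next_a is None or next_b is None:
                continue  # at least one side rejects this continuation
            pair = (next_a, next_b)
            if pair not in seen:
                seen.add(pair)
                queue.append(pair)
    return True


ALPHABET = "01ab"
# Hand-built automata for [01]+, [ab]+ and [01]*1 (illustrative only).
DIGITS = DFA(0, {1}, {(0, "0"): 1, (0, "1"): 1, (1, "0"): 1, (1, "1"): 1})
LETTERS = DFA(0, {1}, {(0, "a"): 1, (0, "b"): 1, (1, "a"): 1, (1, "b"): 1})
ENDS_IN_1 = DFA(0, {1}, {(0, "0"): 0, (0, "1"): 1, (1, "0"): 0, (1, "1"): 1})

if __name__ == "__main__":
    print(intersection_is_empty(DIGITS, LETTERS, ALPHABET))    # True: cannot be connected
    print(intersection_is_empty(DIGITS, ENDS_IN_1, ALPHABET))  # False: some outputs fit
```

A non-empty intersection only shows that some outputs are acceptable inputs. A stricter contract would require the output language to be a subset of the input language, which is a different check from the one sketched here.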
baggy_trough, over 1 year ago
I love how it looks like a CS textbook.
simlevesque, over 1 year ago
Kinda related, but I'm looking for something that could give me the number of possible matching strings for a simple regex. Does such a tool exist?
_a_a_a_, over 1 year ago
Is there any definition of what 'difference and intersection of regexes' might actually mean?

I guess for regexes r1 and r2 this means the diff and intersect of their extensional sets, expressed intensionally as a regex. I guess. But nothing seems defined, including what ^ is, or > or whatever. It's not helpful.
less_less, over 1 year ago
Interesting. I think this problem is actually EXPSPACE-complete in general? But still has a straightforward algorithm.

https://en.wikipedia.org/wiki/EXPSPACE
blibble, over 1 year ago
it always bugged me, as a student who had to sit through all those discrete maths lectures, that standard regex libraries don't allow you to union/intersect two "compiled" regular expression objects together

(having to try them one at a time is pretty sad)
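
With a backtracking engine, a partial workaround is possible at the level of pattern strings rather than compiled objects. The sketch below uses Python's `re` module; the helper names `union` and `intersection` are hypothetical, and the approach assumes both patterns use the same flags, contain no numbered backreferences, and are applied with `fullmatch`. It does not build or combine automata.

```python
import re


def union(a: str, b: str) -> re.Pattern:
    """Match strings accepted by either pattern (use with fullmatch)."""
    return re.compile(f"(?:{a})|(?:{b})")


def intersection(a: str, b: str) -> re.Pattern:
    """Match strings accepted by both patterns (use with fullmatch).

    The lookahead requires `a` to match the whole input up to the end
    of the string (\\Z); `b` is then matched against the same input.
    """
    return re.compile(f"(?=(?:{a})\\Z)(?:{b})")


if __name__ == "__main__":
    both = intersection(r"y.+", r".+z")
    either = union(r"y.+", r".+z")
    for text in ["yz", "yaz", "ya", "az", "y"]:
        print(text, bool(both.fullmatch(text)), bool(either.fullmatch(text)))
```

Because the combined pattern is still run by backtracking, it answers whether one given string is in the union or intersection, but cannot decide questions such as emptiness or equivalence the way a DFA-based construction can.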
snoble, over 1 year ago
Oh neat, this is Scala via Scala.js.
hoten, over 1 year ago
On mobile: are the rectangle glyphs as suffixes on the states on purpose or am I missing a font?
themusicgod1, over 1 year ago
ugh STOP USING GITHUB
haltist, over 1 year ago
Can LLMs do this?