I don't know whether Python was a strict requirement, but you mention the difficulty of finding a library that parses regular expressions.<p>One of the many great features of the Rust regex crate (<a href="https://doc.rust-lang.org/regex/regex/index.html" rel="nofollow">https://doc.rust-lang.org/regex/regex/index.html</a>) is that its parser is available separately as the regex_syntax crate (<a href="https://doc.rust-lang.org/regex/regex_syntax/index.html" rel="nofollow">https://doc.rust-lang.org/regex/regex_syntax/index.html</a>), which can be used for purposes like this one, such as building alternative kinds of matching engines.<p>Rust's regex library also has pretty good DFA and NFA matching engines, plus an Aho-Corasick engine for the parts of a regex that can be decomposed into a set of simple strings. It also has fairly well optimized exact substring search, which speeds up expressions containing a substring that must match exactly, so it can be fairly fast for matches like your "TAG(A|C|T|G)GG" example.<p>Another, even more radical, change would be to build a finite state transducer (FST) of your data set, which you can then do regex matching on directly by taking the intersection of an automaton representing the regex and the FST representing your data (which is itself an automaton). I learned about FSTs from the author of the Rust regex crate: <a href="http://blog.burntsushi.net/transducers/" rel="nofollow">http://blog.burntsushi.net/transducers/</a>. FSTs are used internally for certain types of searches in Elasticsearch/Lucene (<a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-completion.html" rel="nofollow">https://www.elastic.co/guide/en/elasticsearch/reference/curr...</a>). 
I don't know whether this would suit your problem as is, by building a single FST from your set of sequences, or whether you would need the overlapping subsequences idea to build FSTs from a larger number of shorter strings, but it would be an interesting project to explore.
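<p>To make the automaton idea a bit more concrete, here is a rough, standard-library-only sketch. It is not how the regex or fst crates are implemented: a plain trie stands in for the FST, and a hand-written linear DFA stands in for a compiled regex. The names Trie, ClassDfa, tag_pattern and intersect are all made up for illustration. It also assumes an anchored, whole-key match, so finding the pattern inside longer sequences would still need something like the overlapping subsequences idea.

```rust
use std::collections::BTreeMap;

/// One trie node: outgoing edges keyed by byte, plus an end-of-key flag.
#[derive(Default)]
struct Node {
    children: BTreeMap<u8, usize>,
    terminal: bool,
}

/// A plain trie standing in for the FST (hypothetical, for illustration only).
struct Trie {
    nodes: Vec<Node>,
}

impl Trie {
    fn new() -> Self {
        Trie { nodes: vec![Node::default()] }
    }

    fn insert(&mut self, key: &str) {
        let mut cur = 0;
        for &b in key.as_bytes() {
            let found = self.nodes[cur].children.get(&b).copied();
            cur = match found {
                Some(next) => next,
                None => {
                    let next = self.nodes.len();
                    self.nodes.push(Node::default());
                    self.nodes[cur].children.insert(b, next);
                    next
                }
            };
        }
        self.nodes[cur].terminal = true;
    }
}

/// A linear DFA for an anchored pattern: state i accepts any byte in classes[i].
struct ClassDfa {
    classes: Vec<Vec<u8>>,
}

impl ClassDfa {
    fn step(&self, state: usize, b: u8) -> Option<usize> {
        match self.classes.get(state) {
            Some(class) if class.contains(&b) => Some(state + 1),
            _ => None,
        }
    }

    fn is_match(&self, state: usize) -> bool {
        state == self.classes.len()
    }
}

/// Hand-built equivalent of the anchored pattern TAG(A|C|T|G)GG.
fn tag_pattern() -> ClassDfa {
    let classes: Vec<Vec<u8>> = ["T", "A", "G", "ACTG", "G", "G"]
        .iter()
        .map(|s| s.as_bytes().to_vec())
        .collect();
    ClassDfa { classes }
}

/// Walk the trie and the DFA in lockstep, following only edges both allow.
/// This is the intersection of the two automata; keys are returned sorted.
fn intersect(trie: &Trie, dfa: &ClassDfa) -> Vec<String> {
    let mut out = Vec::new();
    let mut stack = vec![(0usize, 0usize, Vec::<u8>::new())];
    while let Some((node, state, prefix)) = stack.pop() {
        let n = &trie.nodes[node];
        if n.terminal && dfa.is_match(state) {
            out.push(String::from_utf8_lossy(&prefix).into_owned());
        }
        for (&b, &child) in &n.children {
            if let Some(next) = dfa.step(state, b) {
                let mut p = prefix.clone();
                p.push(b);
                stack.push((child, next, p));
            }
        }
    }
    out.sort();
    out
}

fn main() {
    let mut trie = Trie::new();
    for seq in ["TAGAGG", "TAGCGG", "TAGXGG", "TTGAGG", "TAGTGGA"] {
        trie.insert(seq);
    }
    for hit in intersect(&trie, &tag_pattern()) {
        println!("{}", hit);
    }
}
```

The point of walking both structures together is that whole subtrees of the data get skipped as soon as the pattern automaton rejects a prefix, so the cost depends on how much of the data shares prefixes with possible matches rather than on the total number of sequences.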