As maintainer of two of the Haskell libraries they use, a member of the core libraries committee, and a Hackage trustee, I'd like to throw in my two cents:

1) I see they are doing fully qualified imports. Please also add major version bounds to your dependencies; Stack doesn't excuse not tracking which versions you used! (See the sketch below.)

2) They have

    data Next a
      = Done !a
      | Step !a

in their code. This tells me they could possibly have benefited from using the stream abstraction in vector: https://hackage.haskell.org/package/vector-0.12.0.2/docs/Data-Vector-Fusion-Stream-Monadic.html
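On point 1, a minimal sketch of what PVP-style major bounds could look like in a .cabal file (the package names and version ranges here are illustrative, not taken from their project):

    build-depends:
        base   >= 4.11 && < 4.13,
        text   >= 1.2  && < 1.3,
        vector >= 0.12 && < 0.13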
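On point 2, a minimal sketch (not the article's code) of how a hand-rolled Done/Step iterator maps onto vector's fusion machinery. In vector-0.12, Data.Vector.Fusion.Stream.Monadic defines roughly `data Step s a = Yield a s | Skip s | Done`, so you supply a step function and an initial state:

    import qualified Data.Vector.Fusion.Stream.Monadic as S

    -- Toy example: count upward using vector's Stream instead of a
    -- bespoke Next type.
    countTo :: Monad m => Int -> S.Stream m Int
    countTo n = S.Stream step 0
      where
        step i
          | i < n     = return (S.Yield i (i + 1))  -- emit i, advance the state
          | otherwise = return S.Done               -- end of stream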
I can't speak to the content or Haskell at all but I really like the clean formatting of the post, particularly the charts and images. Does anyone know what tool might have been used to produce them? So clean and crisp.
Impressive work! The clincher is at the end:

> Relying so much on optimizations to ensure that very high level code compiles to the right low-level code is fragile. Case in point: after upgrading our Stackage snapshot from LTS 10 with GHC 8.2.2 to LTS 13 with GHC 8.6.3, we saw the running time of our benchmark double.

This would scare me off doing any real performance optimization in a language as high-level as Haskell: am I just fiddling with compiler internals now?
What actually amazes me is that... Python is good enough:

> A naive Python script that loops over all needles and calls str.find in a loop to find all matches.

> working through all 3.5 gigabytes of test data once

> Time is ~6.5s.

I love these kinds of benchmarks. They consistently show that for specialised cases you do need to drop down to C/Rust/hand-written Haskell implementations. In many, many more cases, even Python with a naive implementation is good enough.

Not to diss Haskell or Rust. Only saying: choose your tools as you see fit, and benchmark.

Also, kudos to the authors for using the median instead of the mean in the graphs.
> Clearly there was plenty of room for speeding up our string matching. We considered binding the Rust library, but due to the way that Haskell’s foreign function interface interacts with the garbage collector, passing a string to a foreign function requires copying it. That copy alone puts any foreign library at a significant disadvantage, so we decided to see how close we could get in Haskell alone.

It's a shame Haskell copies strings across FFI boundaries, because binding the Rust version would have been almost as fast, with (presumably) significantly less effort.
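For the curious, a minimal sketch of where that copy comes from (the Rust-side symbol here is hypothetical). With text < 2.0, a Text is an unpinned UTF-16 heap buffer that the GC may move, so Data.Text.Foreign.useAsPtr (documented as O(n)) has to copy it into pinned memory before the foreign call:

    {-# LANGUAGE ForeignFunctionInterface #-}

    import qualified Data.Text as T
    import qualified Data.Text.Foreign as TF
    import Data.Word (Word16)
    import Foreign.Ptr (Ptr)

    -- Hypothetical binding to a Rust matcher exposed over the C ABI;
    -- the symbol name and signature are made up for illustration.
    foreign import ccall unsafe "rust_count_matches"
      rustCountMatches :: Ptr Word16 -> Int -> IO Int

    countMatches :: T.Text -> IO Int
    countMatches haystack =
      -- useAsPtr copies the Text into a temporary pinned buffer so the
      -- GC cannot move it while the foreign function is running.
      TF.useAsPtr haystack $ \ptr len ->
        rustCountMatches ptr (fromIntegral len)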
Converting the function to act as a fold instead of just returning a lazy list: is that necessary because all the internal unboxing and strictness meant the matches couldn't actually be lazy?
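For context, a generic sketch of the two shapes being contrasted (illustrative, not the article's actual code):

    {-# LANGUAGE BangPatterns #-}

    -- List shape: consumers can stop early, but the producer must be
    -- able to suspend between elements, allocating a cons cell each time.
    asList :: Int -> [Int]
    asList n = go 0
      where
        go !i | i >= n    = []
              | otherwise = i : go (i + 1)

    -- Fold shape: the consumer is pushed into the loop, so the strict
    -- accumulator never has to escape as a lazy structure.
    asFold :: (a -> Int -> a) -> a -> Int -> a
    asFold f z n = go z 0
      where
        go acc !i | i >= n    = acc
                  | otherwise = go (f acc i) (i + 1)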
It's a pity that Aho-Corasick always seems to be the starting point for this kind of work, because it's as slow as a wet week and/or the tables are huge (pick your poison).

To do this right you need a fast prefilter and a reasonably fast string match. Our string matcher in Hyperscan typically blew AC away on size and speed. Unfortunately, I never had a chance to build its successor. But AC is a disaster of one dependent load after another and is mediocre at best in terms of performance.
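To make the prefilter idea concrete, a toy single-needle sketch in Haskell (nothing like Hyperscan's actual implementation): scan cheaply for one distinguishing byte, and only run the full comparison at candidate offsets.

    import qualified Data.ByteString as BS

    -- elemIndices is the cheap prefilter; isPrefixOf is the (ideally
    -- rarely executed) full verification at each candidate position.
    matchesWithPrefilter :: BS.ByteString -> BS.ByteString -> [Int]
    matchesWithPrefilter needle haystack
      | BS.null needle = []
      | otherwise =
          [ i
          | i <- BS.elemIndices (BS.head needle) haystack
          , needle `BS.isPrefixOf` BS.drop i haystack
          ]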
This does seem to actively circumvent a number of the things that are always held up as reasons Haskell is a great language: it uses strict types in a number of places and has explicit inline and no-inline annotations everywhere.

It also seemed to require a lot of work to structure the code so that tail call optimization would happen, but that's par for the course in Haskell.
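For readers who haven't seen this style, a generic sketch of what such code tends to look like (not the article's source):

    -- Strict fields keep the accumulator evaluated instead of
    -- building up thunks.
    data Acc = Acc !Int !Int

    -- Explicit pragma telling GHC to inline this into call sites.
    {-# INLINE sumTo #-}
    sumTo :: Int -> Int
    sumTo n = go (Acc 0 0)
      where
        -- Tail-recursive loop: GHC compiles this to a jump, not a
        -- stack of real calls.
        go (Acc i acc)
          | i >= n    = acc
          | otherwise = go (Acc (i + 1) (acc + i))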
"A naive Python script that loops over all needles and calls str.find in a loop to find all matches. This is not Aho–Corasick [...]".<p>I'm the author of pyahocrasick, a python C extension that implements the AC. I'd love to see it in the comparison. :)