A few months ago: <a href="https://news.ycombinator.com/item?id=22745351" rel="nofollow">https://news.ycombinator.com/item?id=22745351</a><p>2019: <a href="https://news.ycombinator.com/item?id=19214387" rel="nofollow">https://news.ycombinator.com/item?id=19214387</a>
I would consider Daniel Lemire (the main author) quite an authority on the practical use of vectorization (SIMD). He is a computer science professor at Université du Québec and is also behind the popular Roaring Bitmaps project [1]. You can check out his publication list here [2].<p>[1] <a href="https://roaringbitmap.org/" rel="nofollow">https://roaringbitmap.org/</a><p>[2] <a href="https://lemire.me/en/#publications" rel="nofollow">https://lemire.me/en/#publications</a>
Gigabytes per second can be a worrying statistic. It suggests that the benchmarks are parsing massive JSON files rather than the small ones that real-world applications deal with.<p>However, this library maintains roughly constant throughput for both small (e.g. 300-byte) and large documents, if its benchmarks are accurate.
The GitHub page links to a video that explains some of the internals [1]. Can someone comment on the result they show at 14:26?<p>My understanding is that they run code that makes 2000 branches based on a pseudo-random sequence. Over around 10 runs of that code, the CPU supposedly learns to correctly predict those 2000 branches and performance steadily increases.<p>Do modern branch predictors really have the capability to remember an exact sequence of 2000 past decisions on the same branch instruction? Also, why would the performance increase incrementally like that? I would imagine it would remember the loop history on the first run and achieve maximum performance on the second run.<p>I doubt there's really a neural net in the silicon doing this, as the author speculates.<p>[1] <a href="https://youtu.be/wlvKAT7SZIQ?t=864" rel="nofollow">https://youtu.be/wlvKAT7SZIQ?t=864</a>
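If it helps, here is a minimal standalone sketch (my own, not the code from the talk) of the kind of experiment being described: run the same fixed pseudo-random sequence of 2000 branches several times and time each pass. Be aware that an optimizing compiler may replace the branch with a conditional move, and timings at this scale are noisy, so checking the generated assembly and the branch-misses perf counter gives a much cleaner signal than wall-clock time.<p><pre><code> // build with e.g. g++ -O2 branch_demo.cpp
#include <chrono>
#include <cstdio>
#include <random>
#include <vector>

int main() {
    std::mt19937 gen(42);              // fixed seed: every pass sees the same sequence
    std::vector<int> bits(2000);
    for (auto &b : bits) b = gen() & 1;

    volatile long long sink = 0;       // keeps the work from being optimized away
    for (int pass = 0; pass < 20; ++pass) {
        auto t0 = std::chrono::steady_clock::now();
        for (int b : bits) {
            if (b) sink += 3;          // data-dependent branch the predictor must learn
            else   sink -= 1;
        }
        auto t1 = std::chrono::steady_clock::now();
        auto ns = std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count();
        std::printf("pass %2d: %lld ns\n", pass, (long long)ns);
    }
    return 0;
}
</code></pre>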
For other folks interested in using this in Node.js, the performance of `simdjson.parse()` is currently slower than `JSON.parse()` due to the way C++ objects are converted to JS objects. It seems the same problem affects a Python implementation as well.<p>Performance-sensitive json-parsing Node users must do this instead:<p><pre><code> require("simdjson").lazyParse(jsonString).valueForKeyPath("foo.bar[1]")
</code></pre>
<a href="https://github.com/luizperes/simdjson_nodejs/issues/5" rel="nofollow">https://github.com/luizperes/simdjson_nodejs/issues/5</a>
SQLite can seemingly parse and process gigabytes of JSON per second. I was pretty shocked by its performance when I tried it out the other month. I ran all kinds of queries on JSON structures and it was so fast.
An idea I had a few years ago which someone might be able to run with is to develop new charsets based on the underlying data, not just some arbitrary numerical range.<p>The idea being that characters that are more common in the underlying language would be represented as lower integers and then use varint encoding so that the data itself is smaller.<p>I did some experiments here and was able to compress our data by 25-45% in many situations.<p>There are multiple issues here though. If you're compressing the data anyway you might not have as big of a win in terms of storage but you still might if you still need to decode the data into its original text.
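To make the idea concrete, here is a rough sketch (my own illustration, nothing from an existing library): rank the code points that occur in a sample of the data by frequency, assign the smallest integers to the most frequent characters, and emit each code as a LEB128-style varint so the common characters of the target language end up as a single byte.<p><pre><code> #include <algorithm>
#include <cstdint>
#include <unordered_map>
#include <vector>

using CodePoint = uint32_t;

// Most frequent code point gets code 0, the next gets 1, and so on.
std::unordered_map<CodePoint, uint32_t>
build_codes(const std::vector<CodePoint> &corpus) {
    std::unordered_map<CodePoint, uint64_t> freq;
    for (CodePoint cp : corpus) ++freq[cp];
    std::vector<std::pair<CodePoint, uint64_t>> ranked(freq.begin(), freq.end());
    std::sort(ranked.begin(), ranked.end(),
              [](const auto &a, const auto &b) { return a.second > b.second; });
    std::unordered_map<CodePoint, uint32_t> code;
    for (uint32_t rank = 0; rank < ranked.size(); ++rank)
        code[ranked[rank].first] = rank;
    return code;
}

// LEB128-style varint: 7 payload bits per byte, high bit marks continuation,
// so codes 0..127 (the most frequent characters) take exactly one byte.
void put_varint(std::vector<uint8_t> &out, uint32_t v) {
    while (v >= 0x80) { out.push_back(uint8_t(v) | 0x80); v >>= 7; }
    out.push_back(uint8_t(v));
}

std::vector<uint8_t> encode(const std::vector<CodePoint> &text,
                            const std::unordered_map<CodePoint, uint32_t> &code) {
    std::vector<uint8_t> out;
    for (CodePoint cp : text) put_varint(out, code.at(cp));
    return out;
}
</code></pre>
The code table itself has to be stored or agreed on out of band, and as noted above, general-purpose compression applied to the original text may already capture much of the same gain.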
And if you're looking for a fast JSON lib for CPython, orjson[1] (written in Rust) is the best I've found.<p>[1] <a href="https://github.com/ijl/orjson#performance" rel="nofollow">https://github.com/ijl/orjson#performance</a>
I never thought I’d write this, but we have officially entered a golden age for C++ JSON utils. They are everywhere, and springing up right and left. It is a great time to be alive.
Just noting that this library requires that you are able to hold the expanded document in memory.<p>I needed to parse a very large JSON document and pull out a subset of the data, which didn't work because the document exceeded available RAM.
So what is the fastest JSON library available? orjson claims they are the fastest but they don't benchmark simdjson. simdjson claims they are the fastest but did they forget to benchmark anything?
The author gave a talk last month, which can be viewed on YouTube:<p><a href="https://www.youtube.com/watch?v=p6X8BGSrR9w" rel="nofollow">https://www.youtube.com/watch?v=p6X8BGSrR9w</a>
I use Emacs with lsp-mode (Language Server Protocol) a lot (for Haskell, Rust, Elm and even Java), and there was a dramatic speedup from Emacs 27 onwards when it started using jansson for JSON parsing.<p>I don't think it's the bottleneck at the moment, but it's good to know there are faster parsers out there. I had a quick search but couldn't find any plans to incorporate simdjson, besides a thread from last year on the Emacs China forums.
Very impressive. Still, there are a couple of issues there.<p>This comment is incorrect:
<a href="https://github.com/simdjson/simdjson/blob/v0.4.7/src/haswell/simd.h#L111" rel="nofollow">https://github.com/simdjson/simdjson/blob/v0.4.7/src/haswell...</a><p>The behavior of that instruction is well specified for all inputs. If the high bit is set, the corresponding output byte will be 0. If the high bit is zero, only the lower 4 bits will be used for the index. Ability to selectively zero out some bytes while shuffling is useful sometimes.<p>I’m not sure about this part:
<a href="https://github.com/simdjson/simdjson/blob/v0.4.7/src/simdprune_tables.h#L9-L11" rel="nofollow">https://github.com/simdjson/simdjson/blob/v0.4.7/src/simdpru...</a>
The popcnt instruction is very fast: the latency is 3 cycles on Skylake and only 1 cycle on Zen 2. It produces the same result without RAM loads and therefore without taking up precious L1D space. The code does use popcnt in some places, but apparently the lookup table is still used in others.
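To illustrate the zeroing behavior described above, here is a tiny standalone demo (my own code, not from simdjson) using the 128-bit _mm_shuffle_epi8 intrinsic; the AVX2 vpshufb used in the Haswell kernel behaves the same way within each 128-bit lane.<p><pre><code> // build with e.g. g++ -O2 -mssse3 pshufb_demo.cpp
#include <cstdio>
#include <immintrin.h>

int main() {
    unsigned char src[16], out[16];
    for (int i = 0; i < 16; ++i) src[i] = (unsigned char)(0xA0 + i);

    // Positions 3 and 7 have the high bit set (-> output byte is zeroed);
    // position 5 uses 0x1F, whose low 4 bits (0xF) select src[15].
    unsigned char idx[16] = { 0, 1, 2, 0x80, 4, 0x1F, 6, 0x83,
                              8, 9, 10, 11, 12, 13, 14, 15 };

    __m128i v = _mm_loadu_si128((const __m128i *)src);
    __m128i m = _mm_loadu_si128((const __m128i *)idx);
    __m128i r = _mm_shuffle_epi8(v, m);   // pshufb

    _mm_storeu_si128((__m128i *)out, r);
    for (int i = 0; i < 16; ++i) std::printf("%02x ", out[i]);
    std::printf("\n");  // expect: a0 a1 a2 00 a4 af a6 00 a8 a9 aa ab ac ad ae af
    return 0;
}
</code></pre>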
There's an R (#rstats) wrapper as well: <a href="https://github.com/eddelbuettel/rcppsimdjson" rel="nofollow">https://github.com/eddelbuettel/rcppsimdjson</a>
It seems this is for parsing multiple JSONs, each a few MBs at most. What does one do if they have a <i>single</i> 100GB JSON file? :)<p>ie.<p><pre><code> {
// 100GB of data
}</code></pre>