On-demand JSON: A better way to parse documents?

166 points by warpech, over 1 year ago

15 comments

kristianp, over 1 year ago
So they're creating a DOM-like API in front of a SAX-style parser and getting faster results (barring FPGA and GPU research). It's released as part of simdjson.

I wonder if that kind of front end was done in the age of SAX parsers?

Such a well-written paper.
eternityforest, over 1 year ago
Why not just use msgpack? The advantage of JSON is that support is already built in to everything and you don't have to think about it.

If you start having to actually make an effort to fuss with it, then why not consider other formats?

This does have nice backwards compatibility with existing JSON stuff though, and sticking to standards is cool. But msgpack is also pretty nice.
wruza, over 1 year ago
Alternatively, jsonl/ndjson. The largest parts of JSON documents are usually arrays, not dictionaries. So you can e.g.:

    {<a header about foos and bars>}
    {<foo 1>}
    ...
    {<foo N>}
    {<bar 1>}
    ...
    {<bar N>}

It is compatible with streaming, database JSON columns, and code editors.
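To make the layout above concrete, here is a minimal Python sketch of reading such a file one line at a time. The record shape (a `kind` field, a header line with counts) is an assumption made up for illustration; jsonl/ndjson itself only requires one JSON value per line.

```python
import io
import json


def read_ndjson(stream):
    """Yield one parsed JSON value per non-empty line of a text stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical sample data: a header line followed by "foo" and "bar" records.
sample = io.StringIO(
    '{"kind": "header", "foos": 2, "bars": 1}\n'
    '{"kind": "foo", "id": 1}\n'
    '{"kind": "foo", "id": 2}\n'
    '{"kind": "bar", "id": 1}\n'
)

records = read_ndjson(sample)
header = next(records)  # the first line describes what follows
foo_ids = [r["id"] for r in records if r["kind"] == "foo"]
```

Because each line is parsed independently, only one record needs to be in memory at a time, and a consumer can stop reading as soon as it has what it needs.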
xiphias2, over 1 year ago
I don't really understand what's new here compared to what simdjson supported already.

Anyways, it's the best JSON parser I found (in any language). I implemented fastgron (https://github.com/adamritter/fastgron) on top of it because of the on-demand library's performance.

One problem with the library was that it needed extra padding at the end of the JSON, so it didn't support streaming / memory mapping.
bawolff, over 1 year ago
Is this different from what everyone was doing with XML back in the day?
pkulak, over 1 year ago
This is a real “why didn’t I think of that” moment for sure. So many systems I’ve written have profiled with most of the cpu and allocations in the JSON parser, when all it needs is a few fields. But rewriting it all in SAX is just not worth all the trouble.
jensneuse, over 1 year ago
Sounds similar to a technique we're using to dynamically aggregate and transform JSON. We call this package "astjson" as we're doing operations like "walking" through the JSON or "merging" fields at the AST level. We wrote about the topic and how it helped us to improve the performance of our API gateway written in Go, which makes heavy use of JSON aggregations: https://wundergraph.com/blog/astjson_high_performance_json_transformations_in_golang
hwestiii, over 1 year ago
On the face of it, this sounds kind of like the XML::Twig Perl module.
SushiHippie, over 1 year ago
Related submission from yesterday:

https://news.ycombinator.com/item?id=39319746 - JSON Parsing: Intel Sapphire Rapids versus AMD Zen 4 - 40 points and 10 comments
pshirshov, over 1 year ago
I solved this problem with a custom indexed format: https://github.com/7mind/sick
Waterluvian, over 1 year ago
> The JSON specification has six structural characters (‘[’, ‘{’, ‘]’, ‘}’, ‘:’, ‘,’) to delimit the location and structure of objects and arrays.

Wouldn’t a quote “ also be a structural character? It doesn’t actually represent data, it just delimits the beginning and end of a string.

I get why I’m probably wrong: a string isn’t a structure of chars because that’s not a type in JSON. The above six are the pieces of the two collections in JSON.
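One way to see the distinction in practice: a scanner still has to track quotes, but only so it can ignore structural characters that appear inside strings. The sketch below is a plain character-by-character Python illustration of that idea, not the SIMD approach used by the paper; the function name and sample document are made up for the example.

```python
STRUCTURAL = set("[]{}:,")


def structural_positions(text):
    """Return (index, char) for structural characters that lie outside strings."""
    found = []
    in_string = False
    escaped = False
    for i, ch in enumerate(text):
        if in_string:
            if escaped:
                escaped = False      # this character was escaped; skip it
            elif ch == "\\":
                escaped = True       # next character is escaped
            elif ch == '"':
                in_string = False    # closing quote
        elif ch == '"':
            in_string = True         # opening quote
        elif ch in STRUCTURAL:
            found.append((i, ch))
    return found


doc = '{"a,b": [1, "x:y"]}'
positions = structural_positions(doc)
chars = "".join(ch for _, ch in positions)
```

Here the comma inside `"a,b"` and the colon inside `"x:y"` are skipped, so the quote acts as a mode switch for the scanner rather than as a delimiter of objects or arrays.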
jesprenj, over 1 year ago
Relevant: LEJP - libwebsockets JSON parser.

You specify what you're interested in and then the parser calls your callback whenever it reads the part of a large JSON stream that has your key.

https://libwebsockets.org/lws-api-doc-main/html/md_READMEs_README_json_lejp.html
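For readers unfamiliar with the callback style, here is a rough Python analogue using the standard `json` module's `object_pairs_hook`. This is not LEJP's API, and unlike a true streaming parser it still reads the whole document into memory; it only illustrates the "register a key, get a callback" shape. The function name and sample input are hypothetical.

```python
import json


def parse_with_callbacks(text, callbacks):
    """Parse text, calling callbacks[key](value) for each matching object key."""
    def hook(pairs):
        # Called once per JSON object, innermost objects first.
        for key, value in pairs:
            handler = callbacks.get(key)
            if handler is not None:
                handler(value)
        return dict(pairs)

    return json.loads(text, object_pairs_hook=hook)


seen = []
parse_with_callbacks(
    '{"user": {"name": "ada", "id": 7}, "items": [{"id": 1}, {"id": 2}]}',
    {"id": seen.append},
)
```

After the call, `seen` holds every value that appeared under an `id` key, in the order the enclosing objects were completed.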
skibz, over 1 year ago
Pretty cool!

This reminds me of oboe.js: https://github.com/jimhigson/oboe.js
basil-rash, over 1 year ago
> The JSON syntax is nearly a strict subset of the popular programming language JavaScript.

What JSON isn’t valid JS?
fanseepawnts, over 1 year ago
Sorry, I would never use this. Before I consume any JSON from any source or for any purpose I validate it. Lazy loading serves no purpose if you need validation.

Hint: you need validation.