Json vs. simplejson vs. ujson

125 points by harshulj, about 10 years ago

23 comments

jmoiron, about 10 years ago
When I wrote the same kind of article in Nov 2011 [1], I came to similar conclusions; ujson was blowing everyone away.

However, after swapping a fairly large and JSON-intensive production spider over to ujson, we noticed a large increase in memory use.

When I investigated, I discovered that simplejson reused allocated string objects, so when parsing/loading you basically got string compression for repeated string keys.

The effects were pretty large for our dataset, which was all API results from various popular websites and featured lots of lists of things with repeating keys; on a lot of large documents, the loaded in-memory object was sometimes 100M for ujson and 50M for simplejson. We ended up switching back because of this.

[1] http://jmoiron.net/blog/python-serialization/
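The key-sharing effect described in this comment can be imitated with the standard library alone. The sketch below is only an illustration of the idea, not simplejson's internal mechanism; `interning_pairs_hook` and `load_with_shared_keys` are hypothetical helper names, and the sample document is made up.

```python
import json
import sys

def interning_pairs_hook(pairs):
    # Build each decoded object with interned keys, so that equal keys
    # across many objects point at a single str instance.
    return {sys.intern(key): value for key, value in pairs}

def load_with_shared_keys(text):
    # object_pairs_hook is called once per decoded JSON object.
    return json.loads(text, object_pairs_hook=interning_pairs_hook)

doc = '[{"user_name": "a", "user_id": 1}, {"user_name": "b", "user_id": 2}]'
first, second = load_with_shared_keys(doc)

# Every key in the first object is the very same object as in the second.
shared = all(k1 is k2 for k1, k2 in zip(first, second))
print(shared)
```

On documents made of long lists of similarly shaped records, sharing keys this way means each distinct key is stored once rather than once per record, which is the saving the comment attributes to simplejson.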
borman, about 10 years ago
The problem with all the (widely known) non-standard JSON packages is that they all have their gotchas.

cjson's way of handling unicode is just plain wrong: it uses utf-8 bytes as unicode code points. ujson cannot handle large numbers (somewhat larger than 2**63; I've seen a service that encodes unsigned 64-bit hash values in JSON this way, and ujson fails to parse its payloads). With simplejson (when using the speedups module), a string's type depends on its value, i.e. it decodes strings as 'str' if their characters are ASCII-only, but as 'unicode' otherwise; strangely enough, it always decodes strings as unicode (like the standard json module) when speedups are disabled.
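A quick way to check the large-number gotcha before adopting a library is to round-trip values around the 64-bit boundaries. This is a minimal sketch using the standard json module as the default; `survives_roundtrip` and `BOUNDARY_VALUES` are names invented for illustration.

```python
import json

# Values around the signed and unsigned 64-bit boundaries mentioned above.
BOUNDARY_VALUES = [2**63 - 1, 2**63, 2**64 - 1]

def survives_roundtrip(value, dumps=json.dumps, loads=json.loads):
    """Return True if value encodes and decodes back to an equal value."""
    try:
        return loads(dumps(value)) == value
    except (ValueError, OverflowError, TypeError):
        # Treat an encoder or parser error as a failed round trip.
        return False

results = {value: survives_roundtrip(value) for value in BOUNDARY_VALUES}
for value, ok in results.items():
    print(value, "ok" if ok else "FAILED")
```

To test another library, pass its functions instead, for example `survives_roundtrip(2**64 - 1, ujson.dumps, ujson.loads)`, assuming ujson is installed. Results for third-party libraries may differ between versions.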
Drdrdrq, about 10 years ago
I disagree with the conclusion. How about this: you should use the tool that most of your coworkers already know and which has large community support and adequate performance. In other words, stop fooling around and use the json library. If (IF!!!) you find performance inadequate, try the other libraries. And most of all, if optimization is your goal: measure, measure and measure! </rant>
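In the spirit of "measure, measure and measure", here is a rough benchmark harness. It assumes only that each candidate module exposes `dumps` and `loads`; simplejson and ujson are optional installs and are skipped when missing. The payload is a placeholder and should be replaced with data representative of the real workload.

```python
import importlib
import timeit

CANDIDATES = ["json", "simplejson", "ujson"]  # the last two are optional installs

def load_available(names):
    """Import each module that is installed and skip the rest."""
    modules = {}
    for name in names:
        try:
            modules[name] = importlib.import_module(name)
        except ImportError:
            continue
    return modules

def benchmark(module, payload, repeat=5, number=200):
    """Return best-of-`repeat` seconds for `number` dumps calls and loads calls."""
    encoded = module.dumps(payload)
    dump_time = min(timeit.repeat(lambda: module.dumps(payload), repeat=repeat, number=number))
    load_time = min(timeit.repeat(lambda: module.loads(encoded), repeat=repeat, number=number))
    return dump_time, load_time

payload = {"id": 1, "name": "example", "tags": ["a", "b", "c"]}  # placeholder data
timings = {name: benchmark(mod, payload) for name, mod in load_available(CANDIDATES).items()}
for name, (dump_time, load_time) in timings.items():
    print(f"{name:12s} dumps={dump_time:.4f}s loads={load_time:.4f}s")
```

Taking the minimum over several repeats reduces noise from other processes; the absolute numbers still depend on the machine, the Python version, and the library versions.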
jbergstroem, about 10 years ago
I just want to add another library in here which – at least in my world – is replacing json as the number one configuration and serialisation format. It's called libucl and its main consumer is probably the new package tool in FreeBSD: `pkg`

Its syntax is nginx-like but can also parse strict json. It's pretty fast too.

More info here: https://github.com/vstakhov/libucl
wodenokoto, about 10 years ago
How hard is it to draw a bar graph? I'd imagine it is easier than creating an ASCII table and then turning that into an image, but I've never experimented with the latter.
chojeen, about 10 years ago
Maybe this is a dumb question, but is json (de)serialization really a bottleneck for python web apps in the real world?
michaelmior, about 10 years ago
> ultrajson ... will not work for un-serializable collections

So I can't serialize things with ultrajson that aren't serializable? I must be missing something in this statement.

> The verdict is pretty clear. Use simplejson instead of stock json in any case...

The verdict seems clear (based solely on the data in the post) that ultrajson is the winner.
jroseattle, about 10 years ago
> keep in mind that ultrajson only works with well defined collections and will not work for un-serializable collections. But if you are dealing with texts, this should not be a problem.

Well-defined collections? As in, serializable? Well sure, that's requisite for the native json package as well as simplejson (as far as I can recall -- haven't used simplejson in some time.)

But does "texts" refer to strings? As in, only one data type? The source code certainly supports other types, so I wonder what this statement refers to.
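For reference, this is what "not serializable" looks like with the standard json module, and one common way around it. The sketch uses a set as the offending type; `encode_extra` is a hypothetical fallback function passed through the documented `default` parameter of `json.dumps`.

```python
import json

record = {"name": "example", "tags": {"b", "a"}}  # a set has no JSON equivalent

try:
    json.dumps(record)
    failed = False
except TypeError:
    failed = True  # the standard library refuses to encode the set

def encode_extra(obj):
    """Fallback for types json does not know; here sets become sorted lists."""
    if isinstance(obj, (set, frozenset)):
        return sorted(obj)
    raise TypeError(f"Cannot serialize {type(obj).__name__}")

encoded = json.dumps(record, default=encode_extra)
print(failed, encoded)
```

Whether a third-party encoder accepts a similar fallback hook varies by library, so it is worth checking its documentation rather than assuming parity with the standard module.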
foota, about 10 years ago
I disagree with the verdict at the end of the article. It seems like json would be better if you were doing a lot of dumping? It also comes with the added maintenance guarantee of being an official package.
jkire, about 10 years ago
> We have a dictionary with 3 keys

What about larger dictionaries? With such a small one I would be worried that a significant proportion of the time would be simple overhead.

[Warning: Anecdote] When we were testing out the various JSON libraries we found simplejson much faster than json for dumps. We used *large* dictionaries.

Was the simplejson package using its optimized C library?
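One way to check whether fixed per-call overhead dominates a tiny benchmark is to time the same encoder on dictionaries of very different sizes and compare the cost per key. The sketch below uses only the standard json module; `make_payload` and `seconds_per_key` are hypothetical helpers and the record shape is invented.

```python
import json
import timeit

def make_payload(n_keys):
    """Build a dictionary with n_keys entries, each holding a small record."""
    return {f"key_{i}": {"id": i, "label": f"item {i}", "score": i * 0.5} for i in range(n_keys)}

def seconds_per_key(n_keys, number=20):
    """Average time json.dumps spends per top-level key."""
    payload = make_payload(n_keys)
    total = timeit.timeit(lambda: json.dumps(payload), number=number)
    return total / (number * n_keys)

per_key = {n: seconds_per_key(n) for n in (3, 300, 10_000)}
for n, cost in per_key.items():
    print(f"{n:>6} keys: {cost * 1e9:.0f} ns per key")
```

If the per-key cost for 3 keys is much higher than for 10,000, the small benchmark is mostly measuring call overhead rather than encoding speed.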
ktzar, about 10 years ago
The usage of percentages in the article is wrong. 6 is not 150% faster than 4.
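The arithmetic behind this objection, written out as a small sketch. It assumes the two numbers are rates where higher is better (for example, operations per second); the function names are made up for illustration.

```python
def percent_faster(new, old):
    """Relative increase of new over old, as a percentage."""
    return (new - old) / old * 100

def percent_of(new, old):
    """new expressed as a percentage of old."""
    return new / old * 100

print(percent_faster(6, 4))  # 50.0: 6 is 50% faster than 4
print(percent_of(6, 4))      # 150.0: 6 is 150% of 4
```

The two quantities always differ by exactly 100 percentage points, which is how "150% of" can end up misreported as "150% faster".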
stared, about 10 years ago
But ujson comes at a price of slightly reduced functionality. For example, you cannot set indent. (And I typically set indent for files <100MB; when working with third-party data, manual inspection is often necessary.)

(BTW: I got tempted to try ujson precisely because of the original blog post, i.e. http://blog.dataweave.in/post/87589606893/json-vs-simplejson-vs-ultrajson.)

Plus, AFAIK, at least in Python 3 json IS simplejson (but a few versions older). So every comparison of these libraries is going to give different results over time (likely, with the difference getting smaller). Of course, simplejson is the newer version of the same thing, so it's likely to be better.
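For readers unfamiliar with the indent option mentioned here, this is how it looks in the standard json module. A minimal sketch with made-up data: compact output for machines, indented output for manual inspection.

```python
import json

data = {"id": 1, "tags": ["a", "b"]}

# Compact form: no spaces after separators, smallest output.
compact = json.dumps(data, separators=(",", ":"))

# Human-readable form: two-space indent, keys sorted for stable diffs.
pretty = json.dumps(data, indent=2, sort_keys=True)

print(compact)
print(pretty)
```

Both strings decode to the same object, so the choice only affects file size and readability, not the data.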
willvarfar, about 10 years ago
(My own due diligence when working with serialisation: http://stackoverflow.com/questions/9884080/fastest-packing-of-data-in-python-and-java

I leave this here in case it helps others.

We had other priorities, such as being good for both Python and Java.

At the time we went with msgpack. As msgpack is doing much the same work as json, it just shows that the magic is in the code, not the format.)
apu, about 10 years ago
Also weird crashes with ultra json, lack of nice formatting in outputs, and high memory usage in some situations
dbenhur, about 10 years ago
> Without argument, one of the most common used data model is JSON

JSON is a data representation, not a data model.
js2, about 10 years ago
I'll have to try ultrajson for my use case, but when I benchmarked pickle, simplejson and msgpack, msgpack came out the fastest. I also tried combining all three formats with gzip, but that did not help. Primarily I care about speed when deserializing from disk.
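For anyone wanting to repeat the gzip experiment, here is a minimal sketch of combining JSON with gzip on disk using only the standard library. `save_json_gz` and `load_json_gz` are hypothetical helper names, and the records are synthetic; it says nothing about how the commenter's own benchmark was set up.

```python
import gzip
import json
import os
import tempfile

def save_json_gz(obj, path):
    """Serialize obj as JSON into a gzip-compressed text file."""
    with gzip.open(path, "wt", encoding="utf-8") as handle:
        json.dump(obj, handle)

def load_json_gz(path):
    """Read a gzip-compressed JSON file back into Python objects."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)

records = [{"id": i, "name": f"item {i}"} for i in range(1000)]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "records.json.gz")
    save_json_gz(records, path)
    compressed_size = os.path.getsize(path)
    restored = load_json_gz(path)

raw_size = len(json.dumps(records).encode("utf-8"))
print(raw_size, compressed_size)
```

Compression trades CPU time for fewer bytes read, so whether it speeds up loading depends on how fast the disk is relative to decompression, which is consistent with it not helping in the comment above.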
velox_io, about 10 years ago
I know it goes against the grain, but I wish that binary json (UBJSON) had much more widespread usage. There's no reason tools can't convert it back to json for us old humans.

The speed difference between working with binary streams and parsing text is night and day.
akoumjian, about 10 years ago
We took a look at ujson about a year ago and found that it failed loading even json structures that went 3 layers deep. I also recall issues handling unicode data.

It was a big disappointment after seeing these kinds of performance improvements.
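Problems like these are cheap to detect before switching libraries. The sketch below is a small smoke test for nesting depth and unicode, run here against the standard json module; the cases, `nested`, and `smoke_test` are all invented for illustration and are not the payloads the commenter used.

```python
import json

def nested(depth):
    """Build a dict nested `depth` levels deep around a single leaf."""
    node = {"leaf": True}
    for _ in range(depth):
        node = {"child": node}
    return node

CASES = {
    "nested_3": nested(3),
    "nested_50": nested(50),
    "unicode": {"text": "na\u00efve caf\u00e9 \u2603 \U0001F600"},
    "mixed": {"items": [1, 2.5, None, True, "x", [], {}]},
}

def smoke_test(dumps, loads):
    """Return the names of cases that fail to round-trip through dumps/loads."""
    failures = []
    for name, value in CASES.items():
        try:
            if loads(dumps(value)) != value:
                failures.append(name)
        except Exception:
            # Any crash in the candidate library counts as a failure.
            failures.append(name)
    return failures

failures = smoke_test(json.dumps, json.loads)
print(failures)
```

Passing another library's `dumps` and `loads` runs the same cases against it; adding real payloads from production to `CASES` makes the check far more meaningful than synthetic ones.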
MagicWishMonkey, about 10 years ago
It kills me that the default JSON module is *so* slow. If you're working with large JSON objects you really have no choice but to use a 3rd party module, because the default won't cut it.
bpicolo, about 10 years ago
Python version? Library version? Results are meaningless without that info.
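Recording this information alongside benchmark results takes only a few lines. A minimal sketch using the standard library; `environment_report` is a hypothetical helper, and the package names are the PyPI distribution names assumed for the two third-party libraries discussed in this thread.

```python
import platform
import sys
from importlib import metadata

def environment_report(packages=("simplejson", "ujson")):
    """Collect interpreter details and installed versions of the given packages."""
    report = {
        "implementation": platform.python_implementation(),  # e.g. CPython or PyPy
        "python": platform.python_version(),
        "platform": sys.platform,
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report

report = environment_report()
for key, value in report.items():
    print(f"{key}: {value}")
```

Printing this report at the top of any benchmark output makes the numbers reproducible and comparable later.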
fijal, about 10 years ago
The standard json module has an optimized version in PyPy (it does not beat ujson, but is a lot faster than the stdlib one in CPython).
UUMMUU, about 10 years ago
I was aware of simplejson but had not seen ultrajson. This is awesome to see. Thanks for the writeup.
aaronem, about 10 years ago
*(Python)