Introducing TJSON, a stricter, typed form of JSON

140 points by bascule over 8 years ago

28 comments

bruth over 8 years ago
All of the keys in JSON must be strings, so they should not need tags for themselves. Instead, why not put the tag of the value assigned to the key in the key:

    {
      "s:string":"Hello, world!",
      "b64:binary":"SGVsbG8sIHdvcmxk",
      "i:integer":42,
      "f:float":42.0,
      "t:timestamp":"2016-11-02T02:07:30Z"
    }

This avoids having to mess with the values in general, and integers don't need to be encoded as strings.

EDIT:

I see this constraint:

    Member names in TJSON must be distinct. The use of the same member name more than once in the same object is an error.

which is still satisfied; however, you could have `i:foo` and `s:foo`, which would result in redundant keys in the resulting JSON document. This constraint could be clarified to say that untagged key names must be unique.

Another question: is a mimetype planned for this? `application/tjson`?
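A minimal sketch of how the key-tagged variant suggested above might be decoded. The tag table, the function names, and the choice of standard padded base64 for `b64` are assumptions made for illustration; none of them come from the TJSON spec. It also enforces the "untagged key names must be unique" reading.

```python
import base64
import json
from datetime import datetime, timezone


def _expect(value, kind):
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, kind):
        raise TypeError(f"expected {kind.__name__}, got {type(value).__name__}")
    return value


def _decode_binary(value):
    # Standard base64 with padding is an assumption made for this sketch.
    return base64.b64decode(_expect(value, str), validate=True)


def _decode_timestamp(value):
    # Accept only the UTC "Z" form used in the example above.
    parsed = datetime.strptime(_expect(value, str), "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


# Hypothetical tag table: tag prefix -> decoder for the value.
DECODERS = {
    "s": lambda v: _expect(v, str),
    "i": lambda v: _expect(v, int),
    "f": lambda v: _expect(v, float),
    "b64": _decode_binary,
    "t": _decode_timestamp,
}


def decode_key_tagged(text):
    """Decode an object whose keys look like "<tag>:<name>"."""
    result = {}
    for tagged_key, value in json.loads(text).items():
        tag, sep, name = tagged_key.partition(":")
        if not sep or tag not in DECODERS:
            raise ValueError(f"missing or unknown tag in key {tagged_key!r}")
        if name in result:
            # "i:foo" and "s:foo" would collide once the tags are stripped.
            raise ValueError(f"duplicate untagged name {name!r}")
        result[name] = DECODERS[tag](value)
    return result
```

With this shape, `{"i:foo": 1, "s:foo": "x"}` is rejected even though the tagged keys are distinct as JSON member names.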
tiglionabbit over 8 years ago
When have you ever written a program that doesn't know ahead of time what type of data it's going to be operating on? Especially if you're using a statically typed language.

Whether you validate incoming payloads in JSONSchema or not, you will always have some understanding of what the shape of the incoming JSON is supposed to be, down to the most concrete types. You'll probably receive many JSON payloads that all conform to the same schema. So why bother redundantly describing that schema in every individual payload?

If you want strict types, write a JSONSchema. If you need to know specific sub-type information, start specifying what should go into the "format" field in JSONSchema. They did it in Swagger: http://swagger.io/specification/

Since the article complains about JSON parsers not knowing how to handle certain situations, perhaps people should start writing JSON parsers that allow you to pass in a JSONSchema document at parse time so they're sure to handle each field type correctly.
bpicolo over 8 years ago
I'm still waiting on XML with curly braces instead of angle brackets. As far as I can tell that's all that's holding us back.
msoad over 8 years ago
It's amazing how many people are trying to reinvent Protocol Buffers! Every time I see something like this I think the developer didn't do their research, or maybe they wanted to make a hobby project anyway. Stuff like this is dangerous to use in production. Even JSON as simple as it looks had a lot of bugs that are now.

If you want typed data structure transfer, use Protocol Buffers.
zeveb over 8 years ago
> Its primary intended use is in cryptographic authentication contexts, particularly ones where JSON is used as a human-friendly alternative representation of data in a system which otherwise works natively in a binary format.

The author might care to take a look at canonical S-expressions, a format from the 90s which attempted to do the same thing for many of the same reasons, and has the advantage of being rather more elegant.

E.g.:

    {
      "s:string":"s:Hello, world!",
      "s:binary":"b64:SGVsbG8sIHdvcmxk",
      "s:integer":"i:42",
      "s:float":42.0,
      "s:timestamp":"t:2016-11-02T02:07:30Z"
    }

could be:

    (string "Hello, world!"
     binary [b]|SGVsbG8sIHdvcmxk|
     integer [i]"42"
     float [f]"42.0"
     timestamp [t]"2016-11-02T02:07:30Z")

Which is a perfectly valid encoding, but can use the canonical encoding (useful for cryptographic hashes):

    (6:string13:Hello, world!6:binary[1:b]13:Hello, world!7:integer[1:i]2:425:float[f]4:42.09:timestamp[1:t]20:2016-11-02T02:07:30Z)

Which can be encoded for transport as:

    {KDY6c3RyaW5nMTM6SGVsbG8sIHdvcmxkITY6YmluYXJ5WzE6Yl0xMzpIZWxsbywgd29ybGQhNzpp
    bnRlZ2VyWzE6aV0yOjQyNTpmbG9hdFtmXTQ6NDIuMDk6dGltZXN0YW1wWzE6dF0yMDoyMDE2LTEx
    LTAyVDAyOjA3OjMwWik=}

Granted, 'elegance' is in the eye of the beholder, but I like it.

I also think that there's a deeper concern with any shallow notion of types. An application doesn't care so much about 'some integer' as it does about 'a valid integer for this domain,' and *that* concern is what leads to schemas and profiles and things like that. Just encoding the machine type of a value is insufficient: one has to encode the *domain* type, which means conveying the domain, which means assuming some sort of shared knowledge.
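The canonical form described above can be sketched in a few lines, assuming the usual rules for canonical S-expressions: every atom is written as a decimal byte length, a colon, and the raw bytes; an optional display hint goes in square brackets before the atom; and the transport form is the base64 of the canonical bytes wrapped in braces. The helper names `atom`, `canonical`, and `transport` are hypothetical.

```python
import base64
from typing import Iterable, Optional


def _verbatim(data: bytes) -> bytes:
    # Length-prefixed "verbatim" string: <decimal length>:<raw bytes>
    return str(len(data)).encode("ascii") + b":" + data


def atom(data: bytes, hint: Optional[bytes] = None) -> bytes:
    """Encode one atom, optionally preceded by a display hint like [1:i]."""
    prefix = b"[" + _verbatim(hint) + b"]" if hint is not None else b""
    return prefix + _verbatim(data)


def canonical(items: Iterable[bytes]) -> bytes:
    """Wrap already-encoded items in a list, with no whitespace at all."""
    return b"(" + b"".join(items) + b")"


def transport(encoded: bytes) -> bytes:
    """Base64 transport form: the canonical bytes, base64-encoded, in braces."""
    return b"{" + base64.b64encode(encoded) + b"}"


doc = canonical([atom(b"integer"), atom(b"42", hint=b"i")])
print(doc)             # b'(7:integer[1:i]2:42)'
print(transport(doc))
```

Because there is exactly one byte sequence for a given structure, hashing the canonical form is straightforward, which is the property the comment highlights.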
lillesvin over 8 years ago
I feel like there's a missed opportunity in not calling it TySON or something like that.

That aside, wouldn't it make more sense to fix the JSON parsers instead? They are the ones having issues parsing e.g. 64-bit integers; JSON has no problem holding them.
justin_vanw over 8 years ago
Wow, this looks awful and painful.

There's no reason to tag the type of a field when you have a typed syntax. The real problems with JSON aren't at all addressed by this:

keys have to be strings
lack of 'attributes' like XML, which means you have to make a document convoluted from the start.

For example, let's say I am storing product data. I might do it like:

    {'title': "Billy goes to Buffalo", 'page_count': 193, 'author': "Ray Broadbunky"}

But later I might want to be able to store attributes or metadata. In XML this doesn't change the schema of the document:

    <product>
      <title>Billy goes to Buffalo</title>
      <page_count>193</page_count>
      <author>Ray Broadbunky</author>
    </product>

Can be extended to:

    <product>
      <title human_verified="false">Billy goes to Buffalo</title>
      <page_count human_verified="true">193</page_count>
      <author human_verified="true">Ray Broadbunky</author>
    </product>

It's not beautiful, but anything using this data will not have to change at all to add any metadata like this.

However, with JSON you have to either add new data that can somehow be joined to the original data, or more commonly you have to be very defensive and 'plan for' this stuff, greatly complicating the schema.

You end up starting with:

    {'attributes': [ {'name':'title','value':"Billy goes to Buffalo"}, {'name':'page_count', 'value':193, ...

so that you can add unanticipated things later without breaking consumers of the data.

But at least some problems are addressed:

no standard way to store bytestrings
lack of a time type
marianoguerra over 8 years ago
Isn't there a way to extend the types to specify our own and register constructors for them, like Transit?

Otherwise we will be in the same place as JSON in terms of extension, where our own types are second-class citizens.
emmelaich over 8 years ago
Those labels in the example are confusing. Instead of string, binary, integer, float, timestamp, please use something like name, password, age, height, sessiontime.

Using string and binary is worse than using foo and bar.
Kinnard over 8 years ago
Reminds me of Tyre – Typed regular expressions: https://news.ycombinator.com/item?id=12292389
jasonkostempski over 8 years ago
"underspecification has lead to a proliferation of interoperability problems and ambiguities."

So TJSON has a perfect spec and everyone, now and forever, will interpret it perfectly?
kevinSuttle over 8 years ago
What about https://amznlabs.github.io/ion-docs/ ?
drawkbox over 8 years ago
Why muddy up the actual values, where you will have to parse that value with "t:" where t is the type?

Why stuff it in one key/val? Why not keep them separate, so the parser looks to see if a type is present and, if so, converts to it or validates against it (you can also place other validations/constraints on it like min/max values, length etc -- that will fall apart if you are trying to stuff it all in one key/value).

Like this:

    { "val":"Hello, world!", "type":"string", "validation": "[regex]" }

Instead of:

    { "s:string":"s:Hello, world!" }

This is typically how we type fields in JSON when needed, as there is no parsing needed on the value. If you need to check the type and it is present, you can act on it.
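A rough sketch of the separate-field approach described above, assuming a field is an object with a required `val` and optional `type` and `validation` members. The `TYPES` table, the function name `check_field`, and the treatment of `validation` as a full-match regular expression are assumptions for illustration.

```python
import re

# Hypothetical mapping from type names to Python types.
TYPES = {"string": str, "integer": int, "float": float}


def check_field(field):
    """Return field["val"] after applying the optional type and regex checks."""
    value = field["val"]

    type_name = field.get("type")
    if type_name is not None:
        expected = TYPES[type_name]  # KeyError for an unknown type name
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise TypeError(f"{value!r} is not of type {type_name}")

    pattern = field.get("validation")
    if pattern is not None:
        # The regex constraint only makes sense for string values.
        if not isinstance(value, str) or re.fullmatch(pattern, value) is None:
            raise ValueError(f"{value!r} does not match {pattern!r}")

    return value
```

The value itself is never rewritten, so a consumer that ignores `type` and `validation` can still read `val` directly.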
shitgoose over 8 years ago
JSON became so popular in the first place because of its simplicity, i.e. no schemas, namespaces, or attributes, and a less bizarre notation than XML. Let's keep it this way.
mitchtbaum over 8 years ago
This looks similar to msgpack with saltpack for crypto parts. Right?

http://msgpack.org/

https://saltpack.org/
colanderman over 8 years ago
Six things:

1) "Lack of full precision 64-bit integers" is bullshit. Numeric precision is not specified by JSON. If a parser can't deal with 64-bit integer values, it's a poor parser.

2) "s: UTF-8 string" What does this mean? JSON strings are strings of Unicode code points; JSON itself may be encoded as UTF-8, -16, or -32. So does this mean "encode the string as UTF-8, then represent as Unicode code points"? That makes no sense.

Does this mean "encode the string as UTF-8 and output directly regardless of the encoding of the rest of the JSON output"? That makes no sense either.

So I'm guessing the author just conflated "UTF-8" with "Unicode", which is concerning given that he is attempting to define an interchange protocol.

3) "i: signed integer (base 10, 64-bit range)" What does this mean? (-2^64,2^64)? (-2^63,2^63)? [-2^63,2^63)?

4) "t: timestamp (Z-normalized)" What does that mean? There are literally dozens of timestamp formats. Does he mean full ISO 8601, restricted to UTC?

5) What is the point of TJSON anyway? When you deserialize, you *still* have to check that the data is of the type you expect. At best this saves a bit of parsing, since the deserializer can do that automatically. Various JSON schema languages already exist, which give you this richer typechecking.

The only use case I can think of for this is exactly what the author mentions further down the article: canonicalization for content-aware hashing. But this only works if the only types you care about fall into the small handful he thought of. What about, say, IP addresses? Case-insensitive strings (such as e-mail addresses)?

6) If we're talking about canonicalization, TJSON does not say how to canonicalize decimal numbers. I suppose this stems from the author's mistaken belief that numbers in JSON are IEEE floats (they're not, regardless of what common broken parsers do).

I hate to be so negative, but this really comes off as half-baked.

EDIT: Looking at the spec [1] it seems to address *some* of these, but still indicates a strong confusion between data *types* (Unicode, rational numeric) and data *representations* (UTF-8, IEEE double).

[1] https://github.com/tjson/tjson-spec/blob/master/draft-tjson-spec.md
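Points 1 and 3 above can be illustrated with Python's standard `json` module, which parses integer literals at arbitrary precision and accepts a `parse_int` hook. The sketch below assumes the half-open reading [-2^63, 2^63) of "64-bit range"; the names `strict_int64` and `loads_int64` are hypothetical.

```python
import json

INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def strict_int64(literal: str) -> int:
    """Parse a JSON integer literal, rejecting values outside [-2^63, 2^63)."""
    value = int(literal)
    if not INT64_MIN <= value <= INT64_MAX:
        raise ValueError(f"{literal} is outside the signed 64-bit range")
    return value


def loads_int64(text: str):
    # parse_int is called with the source text of every JSON integer literal.
    return json.loads(text, parse_int=strict_int64)


# The JSON text itself holds the full value; precision is only lost when a
# parser routes integers through an IEEE double.
assert loads_int64('{"n": 9223372036854775807}')["n"] == INT64_MAX
assert float(INT64_MAX) == float(2**63)  # a double cannot tell these apart
```

The range check lives in the parser rather than in the document, which is the distinction the comment draws between a data type and its representation.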
kr0 over 8 years ago
Why don't float types use a tagged string? It says "tagging is mandatory" in the initial document, but floating point types are then omitted in the official spec.
amorphid over 8 years ago
I've been writing a JSON parser when I have a few minutes here and there. I was surprised by the lack of specificity in defining numbers, specifically floats. If floats are known to lose precision after a few decimal places...

    iex> 1.5555555555555555
    1.5555555555555556

...why not just specify a max precision? You can always say "if you need a more precise number, just store it as a string". If I wanted room for interpretation, I'd use YAML!
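For comparison, a small sketch of how a parser can sidestep the precision loss shown above without storing numbers as strings: Python's standard `json` module accepts a `parse_float` hook, and passing `decimal.Decimal` keeps every digit of the literal. This illustrates one parser's behavior and is not something the JSON grammar requires.

```python
import json
from decimal import Decimal

text = "1.5555555555555555"

as_double = json.loads(text)                        # nearest IEEE double
as_decimal = json.loads(text, parse_float=Decimal)  # exact decimal digits

# The Decimal keeps the literal exactly as written.
assert str(as_decimal) == text

# The double is only an approximation of the written value.
assert Decimal(as_double) != as_decimal
```

Whether a consumer gets the exact digits therefore depends on the parser configuration on the receiving side, which is the interoperability gap the comment points at.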
rurban over 8 years ago
This argumentation is complete bullshit and even dangerous.

> "Parsing JSON is a Minefield": From a strictly software engineering perspective these ambiguities can lead to annoying bugs and reliability problems, but in a security context such as JOSE they can be fodder for attackers to exploit. It really feels like JSON could use a well-defined “strict mode”.

Not at all. This article just outlined the differences between the various implementations regarding the 2 specs. And then added a spec test suite, including all the undefined problems, with suggestions on how to go forward.

JSON is already strict enough. The problem is people like the OP making it even less strict. The latest JSON spec, RFC 7159, adds ambiguity by allowing all scalar values at the top level, which leads to practical exploitability. See e.g. https://metacpan.org/pod/Cpanel::JSON::XS#OLD-VS.-NEW-JSON-RFC-4627-VS.-RFC-7159

"For example, imagine you have two banks communicating, and on one side, the JSON coder gets upgraded. Two messages, such as 10 and 1000 might then be confused to mean 101000, something that couldn't happen in the original JSON, because neither of these messages would be valid JSON.

If one side accepts these messages, then an upgrade in the coder on either side could result in this becoming exploitable."

What the OP now suggests is repeating the security mistake YAML made by adding tags to all keys. Here types don't add security, they weaken security!

It is a security nightmare, as it leads to exploits which are e.g. already added to Metasploit (CVE-2015-1592). Tagged decoders are always a problem, and currently JSON and msgpack are the only serializers safe from such exploits due to their strictness.

I would suggest that the remaining JSON libraries first fix their problems by conforming to the specs. First the secure old variant (RFC 4627) as the default, and then maybe the relaxed new RFC 7159 variant, but noting the security problems with interop of scalar values.

Currently only my Cpanel::JSON::XS library passes all these tests from the Minefield article. The Ruby one, which the author complains about, does not. The type problem is especially problematic in dynamic languages like Ruby, where classes are not finalized by default.
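The quoted bank scenario can be reproduced with any parser that accepts top-level scalars. The sketch below uses Python's standard `json` module, which does; `loads_rfc4627` is a hypothetical wrapper that restores the older rule that a document must be an object or an array.

```python
import json


def loads_rfc4627(text: str):
    """Parse JSON, but reject top-level scalars as RFC 4627 did."""
    value = json.loads(text)
    if not isinstance(value, (dict, list)):
        raise ValueError("top-level value must be an object or an array")
    return value


# Two scalar messages that run together on the wire parse as one number.
first, second = json.dumps(10), json.dumps(1000)
assert json.loads(first + second) == 101000

# With the wrapper, neither scalar message is accepted on its own, and two
# concatenated arrays fail to parse instead of merging silently.
for bad in ("10", "[10][1000]"):
    try:
        loads_rfc4627(bad)
    except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
        print("rejected:", bad, "->", exc)
```

Framing the messages (length prefixes or delimiters) would also avoid the confusion; the wrapper only shows why self-delimiting top-level values matter.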
DiabloD3 over 8 years ago
So, why would I use this instead of actual JSON (== browser support), BSON (binary JSON), or Cap'n Proto (I control both ends of this)?
romanovcode over 8 years ago
I'd rather use XML than this atrocity.
mnarayan01 over 8 years ago
> All base64url strings in TJSON MUST NOT include any padding with the '=' character.

This seems like it makes a streaming parser's job (slightly) more of a headache, without any serious advantage. Which seems particularly odd to me given that this seems heavily focused on binary stuff.
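For a non-streaming decoder the missing padding can be restored from the string length, as in the sketch below. It uses Python's `base64.urlsafe_b64encode` and `base64.urlsafe_b64decode`; the wrapper names are hypothetical. A streaming decoder has no such shortcut, since it only learns the total length when the closing quote arrives, which is the headache mentioned above.

```python
import base64


def b64url_encode_unpadded(data: bytes) -> str:
    """Encode to base64url and strip the trailing '=' padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode_unpadded(text: str) -> bytes:
    """Decode base64url text that must not carry '=' padding."""
    if "=" in text:
        raise ValueError("padding characters are not allowed")
    if len(text) % 4 == 1:
        # No whole number of bytes encodes to a length of 4n + 1.
        raise ValueError("invalid base64url length")
    padded = text + "=" * (-len(text) % 4)  # restore 0, 1, or 2 '=' characters
    return base64.urlsafe_b64decode(padded)
```

For example, the 13 bytes of `b"Hello, world!"` encode to 18 characters, so two padding characters are restored before decoding.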
gengkev over 8 years ago
I'm a bit confused that TJSON only allows UTF-8 strings. The only way to escape Unicode characters in JSON is \uXXXX. But to encode astral characters with this syntax, UTF-16 surrogate pairs must be used. How does TJSON handle this, if strings must be encoded with UTF-8 only?
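How TJSON itself resolves this is not stated in the thread. As a point of reference, the sketch below shows what one ordinary JSON parser (Python's standard `json` module) does: a `\uXXXX` surrogate pair in the JSON text is combined into a single code point, which then encodes to UTF-8 like any other character, while a lone surrogate escape cannot be encoded as UTF-8 at all.

```python
import json

# U+1F600 written as a UTF-16 surrogate pair in JSON escape syntax.
escaped = '"\\ud83d\\ude00"'

decoded = json.loads(escaped)
assert len(decoded) == 1 and ord(decoded) == 0x1F600

# Once decoded, the string is just code points; UTF-8 needs four bytes here.
assert decoded.encode("utf-8") == b"\xf0\x9f\x98\x80"

# A lone surrogate escape parses, but the result is not encodable as UTF-8.
lone = json.loads('"\\ud83d"')
try:
    lone.encode("utf-8")
except UnicodeEncodeError as exc:
    print("lone surrogate rejected:", exc.reason)
```

The escape syntax and the encoding of the surrounding document are separate layers, so the surrogate-pair escapes do not by themselves conflict with a UTF-8-only document.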
jasoncchild over 8 years ago
Does a time zone key trigger the enforcement of a specific ISO standard format for the value?
rxbudian over 8 years ago
Why not just have a separate metadata file? It will keep the JSON file lean.
novaleaf over 8 years ago
And still no ability to have comments. One reason I strongly prefer JSON5: http://json5.org/
partycoder over 8 years ago
Can you have a typed array too?
jaimex2 over 8 years ago
This is literally protobuffs.