“Naming a project is always difficult. Since this project is focused on representing changes between JSON documents I naturally started thinking about names like "JSON differ, JSON changes, JSON patch, …". However, most of these names have already been used by existing projects. While I was muttering "JSON, JSON, JSON" it suddenly turned into "JSON, JSON, Jason, Jason Mendoza".”

This is a reference to The Good Place, and it is an amazing name!
This is an interesting tool for computing minimal diffs, but the result is not very human-friendly. If human readability is your goal and you are looking for something better than diff, have a look at graphtage: https://github.com/trailofbits/graphtage

It also works for XML, HTML, YAML and CSV.
Interesting approach! But aren't JSON arrays a pretty wasteful encoding?

Since this is an opaque serialization of an instruction set, why not encode more bits per number (JSON numbers can represent much larger integers losslessly) and move the "symbol table" (the string data) to the end?

This way you could also compress redundant symbols into single strings, roughly as in the sketch below.
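For illustration, here is a rough Python sketch of that kind of compact encoding; the opcode names and bit layout are made up for the example and are not the project's actual instruction set:

    # Hypothetical compact encoding of a patch "program": opcodes and small
    # operands packed into single integers, with a deduplicated string table
    # appended at the end. The opcodes here are invented for illustration.
    import json

    OPS = {"copy": 0, "delete": 1, "insert_str": 2}

    def encode(instructions):
        """instructions: list of (op_name, operand) pairs."""
        symbols = []   # deduplicated string table
        index = {}     # string -> position in the table
        packed = []
        for op, arg in instructions:
            if op == "insert_str":
                if arg not in index:
                    index[arg] = len(symbols)
                    symbols.append(arg)
                arg = index[arg]
            # opcode in the low 3 bits, operand in the remaining bits
            packed.append((arg << 3) | OPS[op])
        return json.dumps([packed, symbols], separators=(",", ":"))

    print(encode([
        ("copy", 4),
        ("insert_str", "hello"),
        ("delete", 2),
        ("insert_str", "hello"),  # reuses the existing table entry
    ]))
    # => [[32,2,17,2],["hello"]]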
I'm super interested in this topic. Recently I started hashing out (and am still working on) how to diff large datasets and what that even means.

I would love to get an understanding of how the HN crowd thinks diffing large datasets (let's say >1 GB) should work.

Are you more interested in a "patch"-quality diff of the data, which is more machine-tailored? Or is a change report/summary/highlights more interesting in that case?

Currently I'm leaning more towards the understanding/human-consumption perspective, which involves some interesting tradeoffs.
Out of curiosity, what problem did you have that this approach solves?

Plain-text diffs are typically compressed for transport anyway, so you end up with something that's human readable at the point of generation and application (where the storage/processing cost associated with legibility is basically insignificant) while being highly compressed during transport (where legibility is irrelevant and no processing beyond copying bits is necessary).
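As a rough illustration of that point (the documents here are made up), a human-readable unified diff compresses well, so legibility costs little once the patch is gzipped for transport:

    # Compare the size of a readable unified diff with its gzipped form.
    import difflib, gzip, json

    old = json.dumps({"name": "Jason", "scores": list(range(100))}, indent=2)
    new = json.dumps({"name": "Jason Mendoza",
                      "scores": [n * 2 for n in range(100)]}, indent=2)

    patch = "\n".join(difflib.unified_diff(old.splitlines(), new.splitlines(),
                                           lineterm=""))

    print("readable patch:", len(patch), "bytes")
    print("gzipped patch: ", len(gzip.compress(patch.encode())), "bytes")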
Have you looked at generating a specialized version of this stack machine for types other than JSON values? It'd be neat to have a version that works directly on typed data via Haskell's GHC.Generics, or a custom derive in Rust.
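To sketch the idea in Python (using dataclass reflection as a loose stand-in for GHC.Generics or a Rust derive; none of this is the project's actual API), the diff can be derived from the type's structure instead of from generic JSON values:

    # Field-by-field diff derived from a dataclass definition.
    from dataclasses import dataclass, fields, is_dataclass

    @dataclass
    class Profile:
        name: str
        score: int

    def typed_diff(old, new):
        """Return {field_name: (old_value, new_value)} for changed fields."""
        assert is_dataclass(old) and type(old) is type(new)
        return {
            f.name: (getattr(old, f.name), getattr(new, f.name))
            for f in fields(old)
            if getattr(old, f.name) != getattr(new, f.name)
        }

    print(typed_diff(Profile("Jason", 1), Profile("Jason Mendoza", 1)))
    # => {'name': ('Jason', 'Jason Mendoza')}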
Isn't this what RFC 6902 (JSON Patch) is for? https://tools.ietf.org/html/rfc6902
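For reference, an RFC 6902 patch is just a JSON list of operations; here it is applied with the third-party jsonpatch package (pip install jsonpatch), using a made-up document for illustration:

    # Build and apply a small RFC 6902 (JSON Patch) document.
    import jsonpatch

    doc = {"name": "Jason", "scores": [1, 2]}
    patch = jsonpatch.JsonPatch([
        {"op": "replace", "path": "/name", "value": "Jason Mendoza"},
        {"op": "add", "path": "/scores/-", "value": 3},   # "-" appends
    ])
    print(patch.apply(doc))
    # => {'name': 'Jason Mendoza', 'scores': [1, 2, 3]}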
Add a few more lines of code and you have yourself an operational transformation library for collaborative data structures (a toy sketch of the transform idea is below).

You may find this relevant: https://github.com/ottypes/json1
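Relatedly, the core of OT is a transform() function that rewrites one concurrent operation against another so both sides converge; a toy version for list insertions (not the ottypes/json1 algorithm) looks like this:

    # Toy operational transformation for concurrent list insertions.
    def apply_op(doc, op):
        kind, index, value = op
        assert kind == "insert"
        return doc[:index] + [value] + doc[index:]

    def transform(op, against):
        """Shift `op` so it still applies after `against` has been applied."""
        kind, index, value = op
        _, other_index, _ = against
        if other_index <= index:
            index += 1
        # (a real implementation also needs a tie-break for equal indices)
        return (kind, index, value)

    doc = ["a", "b", "c"]
    op1 = ("insert", 1, "X")  # site 1 inserts X before "b"
    op2 = ("insert", 2, "Y")  # site 2 concurrently inserts Y before "c"

    # Each site applies its own op first, then the other op transformed.
    site1 = apply_op(apply_op(doc, op1), transform(op2, op1))
    site2 = apply_op(apply_op(doc, op2), transform(op1, op2))
    print(site1, site2)  # both: ['a', 'X', 'b', 'Y', 'c']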
Somehow I'm reminded of how patch runs ed underneath, and I guess it could use more advanced editing commands: https://news.ycombinator.com/item?id=16767509

Also, is there enough here to build a Turing machine? I guess not, but it does seem pretty close.
How does it decide whether two JSON values are equal? E.g. 1 vs 1.00 vs 1e0. (https://preserves.gitlab.io/preserves/why-not-json.html)
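For a concrete example of why this is tricky (using Python's json module; other parsers may behave differently), the three spellings denote the same number but decode to different types and re-serialize differently:

    # How one parser handles 1 vs 1.00 vs 1e0.
    import json

    values = ["1", "1.00", "1e0"]
    decoded = [json.loads(v) for v in values]

    print(decoded)                               # [1, 1.0, 1.0]
    print([type(v).__name__ for v in decoded])   # ['int', 'float', 'float']
    print(decoded[0] == decoded[1])              # True (int/float compare equal)
    print([json.dumps(v) for v in decoded])      # ['1', '1.0', '1.0']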