TechEcho

8 comments

spenczar5 over 3 years ago

This seems pretty confused. The "compiled vs dynamic" distinction is a property of the implementation, not of the protocol.

For example, you can certainly compile Avro into Go source files [0]. You can even compile Avro loaded schemas _during runtime_ into Python bytecode, since Python is interpreted [1]. This even works if you have the _wrong schema document_ for the message (you'll just get the subset of fields which are accurately described), because of Avro's schema compatibility rules.

Likewise, you can deserialize arbitrary protobuf messages during runtime without a compilation step, if you have a description for the message schema. The Python protobuf library has had a "ParseMessage" API forever, and protoreflect [2] exists for Go. (In case it's not obvious, I mostly work in Python and Go but I am completely certain analogues exist in other major languages).

There is a very big and important difference between a protocol and the implementation of a protocol. I think this README's author is not clear on that difference, which shows up in other claims ("Deserialization is incrimental", for example) too.

---

[0] https://github.com/actgardner/gogen-avro

[1] https://github.com/spenczar/avroc

[2] https://pkg.go.dev/google.golang.org/protobuf/reflect/protoreflect
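To make the protocol-versus-implementation point concrete, here is a minimal sketch that decodes the same Avro-style binary record two ways: by walking the schema at decode time, and by generating and compiling a specialized decoder at runtime. The schema, the field names, and the helper functions are all hypothetical and chosen for illustration; this is not how avroc or gogen-avro are implemented, and it covers only the `int` and `string` primitives.

```python
import io

# Hypothetical record schema: ordered (field name, Avro primitive type) pairs.
SCHEMA = [("name", "string"), ("age", "int")]

def read_long(buf):
    """Read one Avro int/long: a base-128 varint, then zigzag-decode it."""
    shift, raw = 0, 0
    while True:
        byte = buf.read(1)[0]
        raw |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7
    return (raw >> 1) ^ -(raw & 1)

def read_string(buf):
    """Read one Avro string: a long byte count followed by UTF-8 bytes."""
    return buf.read(read_long(buf)).decode("utf-8")

READERS = {"int": read_long, "string": read_string}
READER_NAMES = {"int": "read_long", "string": "read_string"}

def decode_dynamic(schema, data):
    """'Dynamic' strategy: consult the schema for every field while decoding."""
    buf = io.BytesIO(data)
    return {name: READERS[typ](buf) for name, typ in schema}

def compile_decoder(schema):
    """'Compiled' strategy: emit Python source for this schema and exec it."""
    lines = ["def decode(data):", "    buf = io.BytesIO(data)", "    out = {}"]
    for name, typ in schema:
        lines.append(f"    out[{name!r}] = {READER_NAMES[typ]}(buf)")
    lines.append("    return out")
    namespace = {"io": io, "read_long": read_long, "read_string": read_string}
    exec("\n".join(lines), namespace)
    return namespace["decode"]

# "Al" has length 2 (zigzag 0x04); age 20 is zigzag 40 (0x28).
MESSAGE = bytes([0x04]) + b"Al" + bytes([0x28])
decode_compiled = compile_decoder(SCHEMA)
print(decode_dynamic(SCHEMA, MESSAGE))
print(decode_compiled(MESSAGE))
```

Both functions read identical bytes and return identical records; only the decoding strategy differs, which is the sense in which "compiled vs dynamic" describes a tool rather than a wire format.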

vlovich123 over 3 years ago

Does anyone know what the `mutable` column means? I think the cap'n'proto section may not be correct as I think you can do read/modify/write.

> all the buffers/data is trusted

This part is also wrong I think. Cap'n'proto is fine to use with untrusted data AFAIK.

> Thus, Cap’n Proto checks the structural integrity of the message just like any other serialization protocol would. And, just like any other protocol, it is up to the app to check the validity of the content.

Likely the author is referring to this section of the home page:

> As of this writing, Cap’n Proto has not undergone a security review, therefore we suggest caution when handling messages from untrusted sources

Which seems more like a disclaimer than "you can't do this" (security is on by default as compared with FlatBuffers where it's opt-in).

erik_seaberg over 3 years ago

Being able to quickly replace or append a field without re-serializing the entire struct can be useful, but the overhead for it is pretty substantial (an extra u32 for every field, plus 1/4 of another u32 for next vtable in the list). Every u32 always consumes four (unaligned?) bytes. I think “far more space efficient” than Avro is pretty unlikely.

I hope he’s also looking at ASN.1 PER for more “don’t write anything the reader already knows” ideas in the vein of Avro.

Without OIDs or UUIDs for versions of types, most of these formats seem equally prone to treating invalid messages as unpredictable gibberish or erroring out on an invalid size or offset.
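The overhead figures in this comment can be turned into a rough estimate. The sketch below takes the numbers at face value: one u32 pointer per field, plus one u32 "next vtable" pointer shared by every four fields. The four-slot vtable size is an assumption inferred from the "1/4" figure and has not been checked against the NoProto format documentation, as is the assumption that vtables are always allocated whole.

```python
import math

POINTER_BYTES = 4      # one u32 per pointer, per the comment above
SLOTS_PER_VTABLE = 4   # assumption: inferred from "1/4 of another u32"

def vtable_overhead_bytes(field_count):
    """Estimate bytes spent on vtable pointers, assuming whole vtables."""
    vtables = math.ceil(field_count / SLOTS_PER_VTABLE)
    # Each vtable holds SLOTS_PER_VTABLE field pointers plus one next pointer.
    return vtables * (SLOTS_PER_VTABLE + 1) * POINTER_BYTES

for n in (1, 4, 10):
    print(n, "fields ->", vtable_overhead_bytes(n), "bytes of pointers")
```

Under these assumptions a fully packed struct pays 5 bytes of pointers per field before any field data is written, and a sparsely populated one pays more per field actually present.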

pmalynin over 3 years ago

Tbh it’s kind of funny folks keep reinventing the wheel when we have had a portable, standardized serialization format for around 50 years: DER.

Groxx over 3 years ago

Every popular RPC format I'm aware of supports dynamic schemas, they're just generally not used unless strictly necessary. E.g. Cap'n Proto has SchemaLoader (dynamically loading compiled schemas) and SchemaParser (parsing schema strings into generic objects): https://capnproto.org/cxx.html#dynamic-reflection, protobuf has "Dynamic Message": https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.dynamic_message, etc. Plus, literally anything you can compile can be done dynamically, so it's not like this is an inherent quality of an encoding, only at best a specific tool.

Broadly: I'm sorta curious about the details here, but I'll have to read more later. It seems to be claiming the world without a whole lot of evidence, and some incorrect claims, while being implemented as a fairly simple (often good!) serialized form which is also quite large (less good!). Also I have no idea why it's so obsessed with sort-ability. It's not like the encoded data as a whole can be meaningfully sorted, and per-value that sounds like "is big-endian"...

E.g. this is the serialized form of a pre-defined struct with a single field, like `{"age":20}`: https://docs.rs/no_proto/0.9.60/no_proto/format/index.html

    // [0, 0, 0, 0, 0, 6, 0, 0, 0, 26, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 20]
    // [ root ptr, vtable, data]

which is pretty far from protocol buffers': https://developers.google.com/protocol-buffers/docs/encoding

    08 96 20
          ^-- 01 in sample

This sort of waste in favor of simplicity seems to be spread throughout the encoding.

I'll happily admit that I think NoProto's format is the easier of the two to read and understand by hand, but that's not generally why we choose binary encodings. It's possible the simplicity gives it a performance edge, but I strongly suspect that, in practice, that's influenced far more by the various implementations' internal details than the encoding itself.
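For a byte-exact version of this size comparison, the following sketch hand-encodes a single varint field the way the protobuf encoding guide describes. It assumes `age` would be declared as field number 1 with a varint wire type, which is a hypothetical schema and not something the thread specifies. The `08 96 01` sample in the protobuf documentation encodes the value 150; under the same rules the value 20 comes out as `08 14`.

```python
def encode_varint(value):
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low7)
            return bytes(out)

def encode_varint_field(field_number, value):
    """Encode one field: tag (field number << 3 | wire type 0), then the value."""
    tag = (field_number << 3) | 0
    return encode_varint(tag) + encode_varint(value)

# NoProto sample bytes copied from the comment above.
NOPROTO_SAMPLE = bytes([0, 0, 0, 0, 0, 6, 0, 0, 0, 26,
                        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 20])

# Assumption: "age" is field number 1 in a hypothetical protobuf message.
protobuf_age = encode_varint_field(1, 20)
print(protobuf_age.hex(" "), "-", len(protobuf_age), "bytes (protobuf)")
print(len(NOPROTO_SAMPLE), "bytes (NoProto sample)")
```

The comparison covers only this one-field message; it says nothing about how the two formats scale with larger or nested structures.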

gravypod over 3 years ago

One of the big benefits of protos is that you have a language-agnostic schema that can evolve safely and backwards compatibly.

I'd love for some team to join that with better performance as it seems like it's possible from what this project has done.

tomberek over 3 years ago

At what point do you throw in the towel and just send in-memory representations of SQLite DBs around?

omegalulw over 3 years ago

How is serialization and de-serialization faster than compiled formats with just spend? If you update many of the fields, some more than once, the overhead would be higher?

NoProto: Flexible, Fast and Compact Serialization with RPC
