They added a feature that impressively fails to interoperate with the rest of the world.<p>> Added well-known type protos (any.proto, empty.proto, timestamp.proto, duration.proto, etc.). Users can import and use these protos just like regular proto files. Additional runtime support are available for each language.<p>From timestamp.proto:<p><pre><code> // A Timestamp represents a point in time independent of any time zone
// or calendar, represented as seconds and fractions of seconds at
// nanosecond resolution in UTC Epoch time. It is encoded using the
// Proleptic Gregorian Calendar which extends the Gregorian calendar
// backwards to year one. It is encoded assuming all minutes are 60
// seconds long, i.e. leap seconds are "smeared" so that no leap second
// table is needed for interpretation.
</code></pre>
Nice, sort of -- all UTC times are representable. But you can't <i>display</i> the time in normal human-readable form without a leap-second table, and even their sample code is wrong in almost all cases:<p><pre><code> // struct timeval tv;
// gettimeofday(&tv, NULL);
//
// Timestamp timestamp;
// timestamp.set_seconds(tv.tv_sec);
// timestamp.set_nanos(tv.tv_usec * 1000);
</code></pre>
That's only right if you run your computer in Google time. And, damn it, Google time leaked out into public NTP the last time there was a leap second, breaking all kinds of things.<p>Sticking one's head in the sand and pretending there are no leap seconds is one thing, but designing a protocol that breaks interoperability with people who <i>don't</i> bury their heads in the sand is another thing entirely.<p>Edit: fixed formatting
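To make the display problem concrete, here's a rough sketch (not from the release; the helper is my own, purely illustrative) of formatting a Timestamp for humans, with the hidden assumption spelled out:<p><pre><code> #include &lt;cstdio&gt;
 #include &lt;ctime&gt;
 #include &lt;string&gt;
 #include &lt;google/protobuf/timestamp.pb.h&gt;

 // Formats a Timestamp as "YYYY-MM-DD hh:mm:ss.nnnnnnnnn UTC".
 // gmtime_r assumes every day is exactly 86400 seconds, i.e. it only
 // agrees with the "smeared" encoding above. If the seconds count came
 // instead from a clock that steps over leap seconds (plain POSIX/NTP
 // time), the displayed value can be off by up to a second around a
 // leap second -- and fixing that needs exactly the leap-second table
 // the spec says you don't need.
 std::string FormatTimestamp(const google::protobuf::Timestamp& ts) {
   time_t secs = static_cast&lt;time_t&gt;(ts.seconds());
   struct tm utc;
   gmtime_r(&secs, &utc);
   char buf[64];
   snprintf(buf, sizeof(buf), "%04d-%02d-%02d %02d:%02d:%02d.%09d UTC",
            utc.tm_year + 1900, utc.tm_mon + 1, utc.tm_mday,
            utc.tm_hour, utc.tm_min, utc.tm_sec, ts.nanos());
   return buf;
 }
</code></pre>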
- removing optional values is actually quite nice. In practice, I end up checking for "missing or empty string" anyway.<p>- the boxed primitive types among the "well-known types" essentially add optional values back in (see the sketch below). And depending on your language bindings, they may look the same.<p>- extensions are still allowed in proto3 syntax files, but only for options - since the descriptor is still proto2. It seems odd to build a proto3 that couldn't represent descriptors.<p>- I still don't understand the removal of unknown fields. Reserialization of unknown fields was always the <i>first</i> defining characteristic of protobufs I described to people. I actually read many of the design/discussion docs internally when I worked at Google, and I still couldn't figure this one out. Although it's certainly simpler…<p>- Protobufs are the "lifeblood" (Rob Pike's words) of Google: the protobuf team is working to get rid of significant Lovecraftian internal cruft, after which their ability to incorporate open source contributions should improve dramatically.
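For the second point, a rough sketch of what I mean (the "User" message is made up, just to show the two styles side by side):<p><pre><code> #include &lt;google/protobuf/wrappers.pb.h&gt;

 // Hypothetical proto3 message, for illustration only:
 //
 //   message User {
 //     string nickname = 1;                          // plain proto3 string
 //     google.protobuf.StringValue display_name = 2; // boxed / "optional"
 //   }

 void HandleUser(const User& user) {
   // Plain proto3 string: "never set" and "" are indistinguishable,
   // so in practice you end up checking for empty.
   if (!user.nickname().empty()) {
     // ... use user.nickname() ...
   }

   // Boxed StringValue: it's a message field, so field presence is back.
   if (user.has_display_name()) {
     // ... use user.display_name().value() ...
   }
 }
</code></pre>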
How does this compare, or more generally, why would you pick this over newer formats like Cap'n Proto or FlatBuffers?<p>From the FlatBuffers overview I see this comparison:<p>---<p>Protocol Buffers is indeed relatively similar to FlatBuffers, with the primary difference being that FlatBuffers does not need a parsing/unpacking step to a secondary representation before you can access data, often coupled with per-object memory allocation. The code is an order of magnitude bigger, too. Protocol Buffers has neither optional text import/export nor schema language features like unions.<p>---<p>So are the newer ones mostly useful when serialization/deserialization speed matters (<a href="https://google.github.io/flatbuffers/" rel="nofollow">https://google.github.io/flatbuffers/</a>)?
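A rough sketch of the practical difference, as I understand it (the "Monster" types below are hypothetical generated classes, one from protoc and one from flatc):<p><pre><code> #include &lt;string&gt;
 #include "flatbuffers/flatbuffers.h"
 // (plus the headers generated by protoc and flatc for the hypothetical
 //  "Monster" schema used below)

 // Protocol Buffers: the wire bytes are unpacked into a separate
 // in-memory object (allocations, copies) before any field is read.
 void ReadWithProtobuf(const std::string& wire_bytes) {
   MonsterProto monster;                 // hypothetical generated class
   monster.ParseFromString(wire_bytes);  // the parse/unpack step
   int hp = monster.hp();                // reads from the unpacked object
   (void)hp;
 }

 // FlatBuffers: fields are read straight out of the buffer; there is no
 // unpack step and no per-object allocation.
 void ReadWithFlatBuffers(const std::string& wire_bytes) {
   const Monster* monster =
       flatbuffers::GetRoot&lt;Monster&gt;(wire_bytes.data());  // no parsing
   int hp = monster->hp();               // reads directly from wire_bytes
   (void)hp;
 }
</code></pre>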
This looks like a nice evolution.<p>It's a pity that the "deterministic serialization" gives so few guarantees; I have worked on at least one project that really needed this.<p>(Basically, we wanted to parse a signed blob, do some work, and pass the original data on without breaking the signature; unfortunately, this requires keeping the serialized form around, since the serialized form cannot be re-generated from its parsed format.)
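For anyone hitting the same issue, a rough sketch of the workaround we used (all message and function names here are made up):<p><pre><code> // Carry the signed message as raw bytes, so the exact bytes that were
 // signed can be verified and forwarded untouched:
 //
 //   message SignedEnvelope {
 //     bytes payload = 1;   // a serialized InnerRequest, exactly as signed
 //     bytes signature = 2;
 //   }

 void HandleAndForward(const SignedEnvelope& env) {
   InnerRequest req;
   req.ParseFromString(env.payload());  // work with a parsed copy...

   // ... do some work using req ...

   // ...but forward the original bytes. Re-serializing req is not
   // guaranteed to reproduce them byte-for-byte, which would break
   // the signature.
   Send(env.payload(), env.signature());
 }
</code></pre>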
"The main intent of introducing proto3 is to clean up protobuf before pushing
the language as the foundation of Google's new API platform"<p>Does anyone know if this means Google's public APIs will be proto3 based? I quite like protobufs.
Shocking! Google's started supporting more languages than just the ones they care about. I really hope this signals the death of their disdain culture.<p>Being a worthwhile Cloud provider means hiring experts in all sorts of languages and supporting their efforts.<p>Imagine a world where Google didn't just "support node" (YEARS late), but actually turned their v8 expertise into a Cloud product.<p>But that'd involve convincing Java-devs-turned-VPs to care about JavaScript, <2004>and EVERYONE knows that JavaScript is a terrible language.</2004>
Sadly the JSON format they chose isn't actually suitable for high-performance web apps. Web developers who use protobufs will continue to get by with various nonstandard JSON encodings.
Google also has flatbuffers. I wonder if flatbuffers is being used by enough developers to justify significant development?<p><a href="https://github.com/google/flatbuffers" rel="nofollow">https://github.com/google/flatbuffers</a>
> primitive fields set to default values (0 for numeric fields, empty for string/bytes fields) will be skipped during serialization.<p>I don't totally understand this. Presumably during deserialization they will be set to defaults and not missing? Otherwise, coupled with the removal of required fields, it seems impossible to actually send a 0-value number or empty string, or to send a proto without a field and not have it set to 0 or "" (have to explicitly null the field?).
I was hoping for packed serialization of non-primitive types. I once used Protobuf to serialize small point clouds, and ended up needing to serialize them as a packed double array and reconstruct the (x, y, z) structure at read time to avoid Protobuf malloc'ing each point individually. Not a huge deal, but it would be a real pain for more complex types.
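Roughly what that looked like, in case it helps anyone (the message names are made up):<p><pre><code> #include &lt;vector&gt;

 // Instead of one message per point,
 //
 //   message Point { double x = 1; double y = 2; double z = 3; }
 //   message Cloud { repeated Point points = 1; }  // one alloc per Point
 //
 // the cloud is stored as a single packed double array:
 //
 //   message PackedCloud { repeated double coords = 1; } // x,y,z,x,y,z,...

 struct Point3 { double x, y, z; };

 // Rebuilds the (x, y, z) structure at read time.
 std::vector&lt;Point3&gt; Unpack(const PackedCloud& cloud) {
   std::vector&lt;Point3&gt; points;
   points.reserve(cloud.coords_size() / 3);
   for (int i = 0; i + 2 &lt; cloud.coords_size(); i += 3) {
     points.push_back({cloud.coords(i), cloud.coords(i + 1),
                       cloud.coords(i + 2)});
   }
   return points;
 }
</code></pre>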
Could someone explain to me why you would use Protocol Buffers, Cap'n Proto, etc. versus rolling your own type-length-value protocol, besides API interop?<p>What if your team could write a smaller TLV protocol, and it was necessary to keep your codebase small? Would this not be wise? Are Protobufs and company not comparable to TLV protocols?
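For concreteness, here's the kind of minimal hand-rolled TLV record I mean (just a sketch); as I understand it, protobuf's wire format is itself essentially tag/type plus length-delimited values, with varints, a schema language, and code generation layered on top:<p><pre><code> #include &lt;cstdint&gt;
 #include &lt;string&gt;
 #include &lt;vector&gt;

 // A minimal hand-rolled TLV record: 1-byte type, 4-byte little-endian
 // length, then the value bytes.
 void AppendTlv(std::vector&lt;uint8_t&gt;* out, uint8_t type,
                const std::string& value) {
   out->push_back(type);
   const uint32_t len = static_cast&lt;uint32_t&gt;(value.size());
   for (int i = 0; i &lt; 4; ++i) {
     out->push_back(static_cast&lt;uint8_t&gt;(len &gt;&gt; (8 * i)));  // length, LE
   }
   out->insert(out->end(), value.begin(), value.end());
 }
</code></pre>
What you give up relative to protobuf is mainly the schema language, the generated accessors, and a wire format that other teams and languages already speak.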