Wire Protocols and APIs

Mar 6, 2022

Most of the data that moves over the network at companies like Google and Uber isn't encoded as JSON and doesn't travel through REST APIs. Instead, messages are encoded as protocol buffers and sent over RPC APIs. Why is this most likely the future, and what are the implications?

Why?

  • JSON is a great format for human-readable messages. But what's human-readable is often much slower to serialize. Depending on your benchmark, protobufs are about 5x faster than JSON.
  • JSON is a schema-less message format. Protobufs have a typed schema. That means they can be type-checked, but also optimized and bin-packed. Client and server stubs can be generated from the schema.
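To make the "bin-packed" point concrete, here is a minimal sketch that hand-encodes a single integer field the way the protobuf wire format does (a tag built from the field number and wire type, followed by a base-128 varint) and compares it with the JSON encoding of the same value. The `Sample` message and the helper names `encode_varint` and `encode_uint_field` are made up for illustration; real projects would use generated code from `protoc` rather than encoding bytes by hand. It only compares size, not serialization speed.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 bits per byte)."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)         # high bit clear: last byte
            return bytes(out)


def encode_uint_field(field_number: int, value: int) -> bytes:
    """Encode one varint-typed field: the tag first, then the value."""
    wire_type = 0  # 0 means "varint" in the protobuf wire format
    tag = (field_number << 3) | wire_type
    return encode_varint(tag) + encode_varint(value)


# Hypothetical schema (an assumption, not from the post):
#   message Sample { int32 id = 1; }
binary = encode_uint_field(1, 150)
text = json.dumps({"id": 150}).encode("utf-8")

print(binary.hex(), len(binary))  # 089601 3
print(text, len(text))            # b'{"id": 150}' 11
```

The binary form carries only the field number, never the field name, which is where the schema earns its keep. It is also why the bytes mean little to a human reader who doesn't have the schema at hand.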

Implications

Protobufs are often used with RPC APIs instead of REST APIs. To my knowledge, there's no specific reason for this: you can absolutely serialize REST API messages with protobufs. My guess is that (1) RPC is used internally at Google, and Google invented protobufs, so naturally the two work well together, and (2) protobufs already have a schema, so client and server code can be generated from it.
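As a rough illustration that nothing ties protobufs to RPC, here is a framework-free sketch of a REST-style `GET /users/<id>` handler whose response body is protobuf bytes rather than JSON. Everything here is an assumption for the example: the `User` message, the `USERS` table, the `handle_get` function, and the `application/x-protobuf` content type (a commonly seen convention, not something the post specifies).

```python
from typing import Dict, Tuple

# Hypothetical in-memory store: user id -> pre-encoded protobuf bytes.
# b"\x08\x96\x01" is field 1 (varint) = 150 for an assumed schema:
#   message User { int32 id = 1; }
USERS: Dict[int, bytes] = {150: b"\x08\x96\x01"}


def handle_get(path: str) -> Tuple[int, Dict[str, str], bytes]:
    """Handle GET /users/<id> and return (status, headers, body)."""
    parts = path.strip("/").split("/")
    if len(parts) != 2 or parts[0] != "users" or not parts[1].isdecimal():
        return 404, {}, b""
    body = USERS.get(int(parts[1]))
    if body is None:
        return 404, {}, b""
    # The resource and URL are plain REST; only the body encoding differs.
    headers = {"Content-Type": "application/x-protobuf"}
    return 200, headers, body


status, headers, body = handle_get("/users/150")
print(status, headers["Content-Type"], body.hex())
```

The URL, verb, and status codes are untouched; the client just needs the schema to decode the body.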

Protobufs are also more difficult to work with and debug. Code generation and static typing add a layer of complexity to a project that JSON doesn't. Protobufs aren't a good fit for configuration either; they are good strictly for moving data over the wire.