It's Just a Tarball

Sep 20, 2022

Sometimes complex software is simple when you go a few layers down.

For example, take the container image. There's so much complexity around building, deploying, and managing containers at scale. Yet container images are just tarballs. Add a few metadata files, and you could quickly build one without any special tooling: in an unprivileged environment, in code, or even by hand.
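To make that concrete, here is a rough sketch of assembling a single-layer image using only Python's standard library. It assumes the OCI image layout (an `oci-layout` marker, an `index.json`, and content-addressed files under `blobs/sha256/`) and an uncompressed layer. The function name `build_oci_image` and the hard-coded `amd64`/`linux` values are illustrative choices, and the output has not been validated against any particular runtime.

```python
import hashlib
import io
import json
import tarfile


def _add_bytes(tar, name, data):
    """Add an in-memory file to an open tar archive."""
    info = tarfile.TarInfo(name=name)
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))


def _descriptor(media_type, blob):
    """Describe a blob by media type, sha256 digest, and size."""
    digest = "sha256:" + hashlib.sha256(blob).hexdigest()
    return {"mediaType": media_type, "digest": digest, "size": len(blob)}


def build_oci_image(files):
    """Return the bytes of an image tarball with one layer (hypothetical helper)."""
    # The layer is itself just a tar of the filesystem contents.
    layer_buf = io.BytesIO()
    with tarfile.open(fileobj=layer_buf, mode="w") as layer_tar:
        for path, data in files.items():
            _add_bytes(layer_tar, path, data)
    layer = layer_buf.getvalue()
    layer_desc = _descriptor("application/vnd.oci.image.layer.v1.tar", layer)

    # Metadata file 1: the image config. The layer is uncompressed, so its
    # diff_id is the same digest as the blob itself.
    config = json.dumps({
        "architecture": "amd64",  # assumed target platform
        "os": "linux",
        "rootfs": {"type": "layers", "diff_ids": [layer_desc["digest"]]},
    }).encode()
    config_desc = _descriptor("application/vnd.oci.image.config.v1+json", config)

    # Metadata file 2: the manifest, pointing at the config and the layer.
    manifest = json.dumps({
        "schemaVersion": 2,
        "mediaType": "application/vnd.oci.image.manifest.v1+json",
        "config": config_desc,
        "layers": [layer_desc],
    }).encode()
    manifest_desc = _descriptor(
        "application/vnd.oci.image.manifest.v1+json", manifest
    )

    # Metadata file 3: the index, pointing at the manifest.
    index = json.dumps({"schemaVersion": 2, "manifests": [manifest_desc]}).encode()

    image_buf = io.BytesIO()
    with tarfile.open(fileobj=image_buf, mode="w") as image_tar:
        _add_bytes(image_tar, "oci-layout",
                   json.dumps({"imageLayoutVersion": "1.0.0"}).encode())
        _add_bytes(image_tar, "index.json", index)
        for blob in (layer, config, manifest):
            name = "blobs/sha256/" + hashlib.sha256(blob).hexdigest()
            _add_bytes(image_tar, name, blob)
    return image_buf.getvalue()
```

Calling `build_oci_image({"hello.txt": b"hi\n"})` returns bytes you could write to a `.tar` file. Everything in it is either a tar archive or a small JSON document.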

Or take Git's object model. Git is known for its terrible UX, so we sometimes assume that everything under the porcelain is also complex. Yet Git's object model is pretty simple – content-addressed blobs (file-like), trees (folder-like), and commits, all stored in the .git/objects folder.
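As a small illustration of how little is going on, the sketch below writes and reads a blob the way Git stores loose objects: a short header, a NUL byte, the content, all zlib-compressed and filed under the hash of the uncompressed bytes. It assumes a repository using the default SHA-1 object format, and the helper names `write_blob` and `read_object` are made up for this example.

```python
import hashlib
import os
import zlib


def write_blob(git_dir, content):
    """Store bytes as a loose blob object and return its object ID."""
    # Object format: "<type> <size>\0<content>"
    store = b"blob " + str(len(content)).encode() + b"\0" + content
    oid = hashlib.sha1(store).hexdigest()
    # The first two hex characters become a directory, the rest the filename.
    path = os.path.join(git_dir, "objects", oid[:2], oid[2:])
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(zlib.compress(store))
    return oid


def read_object(git_dir, oid):
    """Return (type, content) for a loose object."""
    path = os.path.join(git_dir, "objects", oid[:2], oid[2:])
    with open(path, "rb") as f:
        raw = zlib.decompress(f.read())
    header, _, body = raw.partition(b"\0")
    kind, _size = header.split(b" ")
    return kind.decode(), body
```

The ID this produces for a given file should match what `git hash-object` prints for the same content, since the ID depends only on the header and the bytes.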

Some Git commands are still just shell scripts under the hood – e.g., git subtree. (However, many of them have slowly been converted to builtins written in C over the years.)

Not to mention plaintext protocols. HTTP, SMTP, FTP, and Redis Serialization Protocol (RESP) are a few examples.
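RESP is a good one to look at, because a client request is nothing more than an array of length-prefixed strings separated by CRLF. Here is a minimal sketch of the encoding side; `encode_command` is a hypothetical helper, not part of any Redis client library.

```python
def encode_command(*args):
    """Encode a command as a RESP array of bulk strings."""
    # "*<count>" announces how many arguments follow.
    parts = [b"*" + str(len(args)).encode() + b"\r\n"]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode()
        # Each argument is "$<byte length>", CRLF, the bytes, CRLF.
        parts.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(parts)


print(encode_command("SET", "key", "value"))
# b'*3\r\n$3\r\nSET\r\n$3\r\nkey\r\n$5\r\nvalue\r\n'
```

Those bytes are readable enough that you could type them into a raw TCP connection by hand.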

Maybe one caveat is that nascent technology is often unnecessarily complex. Things are still getting pieced together – unoptimized workflows, artifacts left over from failed experiments.

Richard Feynman, the late Nobel Laureate in physics, was once asked by a Caltech faculty member to explain why spin one-half particles obey Fermi-Dirac statistics. Rising to the challenge, he said, "I'll prepare a freshman lecture on it." But a few days later he told the faculty member, "You know, I couldn't do it. I couldn't reduce it to the freshman level. That means we really don't understand it."