Merkle Trees

Apr 25, 2022

A Merkle tree is a tree of interconnected hashes. When one leaf node changes, the hash of each parent up the chain also changes, and ultimately, the root hash changes.

Peer-to-peer networks often use Merkle trees. It allows each peer to efficiently ensure that no data was lost or modified in transit. Receivers can verify small chunks of data when they are sent by checking them against a small set of hashes. The complete data set isn't needed for this verification.

Copy-on-write filesystems like btrfs and ZFS use Merkle trees to verify the filesystem's integrity. In the Nix package manager, the build rules and direct dependencies of a package make up a node of a Merkle tree. When Git syncs or calculates diffs, it's checking against a Merkle tree of objects that it computes and stores.

Certificate Transparency (CT) logs also use Merkle trees. When some Certificate Authorities issue certificates (e.g., for HTTPS), they are written to a public, verifiable, append-only log. The Merkle tree-powered log doesn't stop wrongly issued certificates but makes them discoverable.

Finally, Merkle trees are used in blockchains like Bitcoin and Ethereum to verify that a transaction was included in a particular block without transferring the whole block's contents over the network.