Compression / Learning Duality

Sep 30, 2023

Compression algorithms encode information efficiently. Compression is what makes a zip file smaller than the sum of its parts, an MP3 smaller than a FLAC studio recording, or a JPEG smaller than a RAW photo. It can be lossy (some information is irreversibly discarded) or lossless (the original can be reconstructed exactly).
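To make the lossless case concrete, here is a minimal sketch using zlib from Python's standard library. The helper name roundtrip and the sample text are made up for illustration. It compresses some repetitive bytes, decompresses them, and checks that nothing was lost.

```python
import zlib

def roundtrip(data):
    """Compress then decompress; return (original size, compressed size, identical?)."""
    compressed = zlib.compress(data, 9)    # 9 = highest compression level
    restored = zlib.decompress(compressed)
    return len(data), len(compressed), restored == data

# Hypothetical sample input: a 20-byte phrase repeated 500 times.
repetitive = b"the quick brown fox " * 500

original_size, compressed_size, identical = roundtrip(repetitive)
print(original_size, compressed_size, identical)
```

The compressed size should come out far smaller than the original because the input repeats itself, and the final value confirms the round trip is exact. A lossy codec like MP3 or JPEG would fail that last check by design.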

Compression is useful because it reduces the resources needed to transmit or store data. In that way, compression is closely related to the general idea of learning.

  • Semantic compression. The ability to convey meaning succinctly is apparent in metaphors, analogies, and reductions to first principles, although it sometimes shows up in ways that don’t look like language (like 😂). Mnemonics help us remember long sequences. Acronyms help us communicate faster.
  • Conceptual compression. We learn by distilling a set of observations into a few meaningful concepts. We don’t remember all of our driving lessons specifically, but we remember the general idea.

Every idea in math, science, and beyond has some underlying complexity. The simplest ideas (or code) can be expressed in the fewest steps. This is captured by Kolmogorov complexity: the length of the shortest program that outputs a given string, measured relative to a chosen programming language. A string of a million "A"s has low Kolmogorov complexity because a short Python program like print("A" * 1000000) produces it, while a typical random string of the same length has no description much shorter than itself. Kolmogorov complexity is uncomputable: there is no general procedure for finding the shortest program, so you can rarely be sure that a shorter one doesn’t exist. Still, it’s a great thought experiment, and it might be useful for everyday learning. What’s the shortest program to understand a topic?
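Although the true shortest program can't be computed, an off-the-shelf compressor gives a rough upper bound: the compressed bytes plus a fixed-size decompressor are one program that reproduces the string. The sketch below assumes zlib as that stand-in, and the name compressed_length is hypothetical. It compares a highly regular string with random bytes of the same length.

```python
import os
import zlib

def compressed_length(data):
    """Size in bytes of the zlib-compressed data.

    This is only a rough upper bound on description length,
    not the actual Kolmogorov complexity.
    """
    return len(zlib.compress(data, 9))

regular = b"A" * 100_000        # one simple pattern, repeated
noisy = os.urandom(100_000)     # random bytes with no pattern to exploit

print("regular:", compressed_length(regular))
print("noisy:  ", compressed_length(noisy))
```

The regular string should shrink to a tiny fraction of its size, while the random bytes should barely shrink at all (and may even grow slightly). A general-purpose compressor only finds certain kinds of patterns, so a large result here doesn't prove that a string is truly complex.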

Compression comes from the Latin compressare: com- means together, and pressare means to press. Literally, to press together. Maybe we compress valuable and foundational information by pressing it into a single act, like “one more thing” in a presentation.