Materializing Results

Sep 9, 2022

Cache invalidation is hard. Even if it's not really "cache invalidation." The problem is that you often want denormalized data from your relational databases. But complex joins and large amounts of data can make those queries expensive (in terms of both time and dollar cost).

The answer is often an incremental approach. A materialized view provides an up-to-date cached table of the denormalized data. They've been around in some form since 1998 (Oracle 8). You can manually implement them with triggers and state functions, but those solutions aren't generalizable.

The industry seems to be backfilling popular database products with support for martialized views. BigQuery added support in 2020. Snowflake

The recent few years of innovation have been built off two papers:

Out of this research, there's been a few different startups (e.g., Readyset, Materialized) that implement a common wire protocol (Postgres/MySQL) and add support for materialized views via one of these methods.