2023-05-08

“We Have No Moat, And Neither Does OpenAI,” a supposedly leaked document from Google, makes some interesting points. The competitive landscape shifts, and so do the moats.

What is no longer a moat

Data is no longer a moat. For example, GPT-3 and Stable Diffusion were trained on public data sets by companies or groups with zero proprietary data. Now, model arbitrage captures any difference between publicly available models: just use one to generate training data for another. But what about code training data on GitHub? The Pile (the dataset used for many of the open-source LLMs) includes more than 192,000 GitHub repositories with over 100 stars. Plus, other LLMs are far from the only way to generate synthetic training data for code.
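
To make "use one model to generate training data for another" concrete, here is a minimal sketch of the idea. Everything in it is an assumption for illustration: `teacher` stands in for whatever call returns a completion from an available model, and `toy_teacher`, `build_synthetic_dataset`, and `to_jsonl` are hypothetical names, not part of any real library.

```python
import json
from typing import Callable, Iterable


def build_synthetic_dataset(
    teacher: Callable[[str], str],
    prompts: Iterable[str],
) -> list[dict[str, str]]:
    """Collect (prompt, completion) pairs produced by a teacher model."""
    records: list[dict[str, str]] = []
    seen: set[tuple[str, str]] = set()
    for prompt in prompts:
        completion = teacher(prompt).strip()
        # Skip empty outputs and exact duplicate pairs.
        if not completion or (prompt, completion) in seen:
            continue
        seen.add((prompt, completion))
        records.append({"prompt": prompt, "completion": completion})
    return records


def to_jsonl(records: list[dict[str, str]]) -> str:
    """Serialize records as one JSON object per line."""
    return "\n".join(json.dumps(record) for record in records)


def toy_teacher(prompt: str) -> str:
    # Hypothetical stand-in for a real model call.
    return f"Answer to: {prompt}"
```

Calling `build_synthetic_dataset(toy_teacher, prompts)` and then `to_jsonl(...)` on the result yields a file-ready set of prompt and completion pairs that a second model could be fine-tuned on. A real pipeline would likely add quality filtering beyond the exact-duplicate check shown here.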

Foundational models are no longer a moat. I’ve written about this several times over the last two and a half years.

The new (old) moats in AI