It’s Too Early To Call Winners in AI

Jun 12, 2023

OpenAI and Microsoft have been crowned the winners of the generative AI wave. Google is seen as having made severe missteps. Pundits say that incumbents will accrue all the value and that there are no moats for startups. But ChatGPT is only seven months old. Incumbent AI features seem to be working, but we don't know the counterfactual. New models are being trained and deployed. Open-source models aren't as capable as proprietary ones yet, but they are catching up, and in some dimensions (developer tooling, ecosystem) they have already pulled ahead.

It’s too early to call winners in AI.

What could change?

  • Foundation models get commoditized. Companies can try to limit functionality via APIs, or try to build differentiation into the training set or model architecture. But it could be a race to the bottom: APIs will keep getting cheaper, and big companies will subsidize them to win back market share. Keeping up with state-of-the-art foundation model performance will demand enormous time and money, and even the best companies will ship relatively undifferentiated products. Model arbitrage might mean that any gains are quickly competed away.
  • Incumbent AI products aren’t as good as native AI ones. Microsoft is cashing in on generative AI features across every product line, from Office to Windows. Distribution matters (and almost always wins). But what if a spreadsheet or word processor is no longer the right product for the use cases where generative AI works best? Incumbent products have tremendous distribution, but they carry an equal amount of product debt. Design decisions that were path-dependent no longer apply. Whatever disrupts many of these products will likely be built on generative AI.
  • First-mover advantage is overrated. Google wasn’t the first search engine, and Facebook wasn’t the first social network.
  • Companies can’t ship. If it doesn’t ship, it doesn’t exist. We’ve seen many low-effort generative AI features (or mere announcements) from incumbents. Do engineers at big companies have the right incentives to ship?
  • Nobody knows the killer app. Retrieval augmentation over knowledge bases and enterprise search is probably not the killer use case for large language models (a minimal sketch of the pattern follows this list). Chatbots are fun demos but are probably not the end product. Code, not chat, makes more sense.
  • Frontier technology is inherently hard to predict. We can barely evaluate today’s models, let alone know exactly where the technology goes next. Will more abilities emerge past some scale threshold? How good is good enough? What data matters? Does any data matter?
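For readers unfamiliar with the retrieval augmentation pattern mentioned above, here is a minimal sketch: retrieve the passages most relevant to a question from a knowledge base, then prepend them to the prompt sent to a language model. Everything in it is illustrative; the keyword-overlap retriever, the prompt format, and the sample documents are assumptions standing in for a real system's embedding search and LLM API call.

```python
# Minimal sketch of retrieval augmentation (illustrative only, no real product's API).
# A toy keyword-overlap retriever stands in for embedding search over a vector index.

def score(query: str, doc: str) -> int:
    """Count overlapping words between the query and a document (toy relevance score)."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents with the highest keyword overlap."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Stuff the retrieved passages into the prompt ahead of the question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

if __name__ == "__main__":
    knowledge_base = [
        "The travel policy caps hotel reimbursement at $250 per night.",
        "Expense reports are due within 30 days of the trip.",
        "The office closes at 6pm on Fridays.",
    ]
    prompt = build_prompt("What is the hotel reimbursement limit?", knowledge_base)
    print(prompt)  # This prompt would then be sent to a language model.
```

A production system would swap the toy scorer for embedding similarity over a vector index, but the shape of the pattern, retrieve then prompt, is the same.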