An Intelligent Wikipedia

Oct 14, 2023

Wikipedia's top five accounts (by number of edits) are all bots. There’s MalnadachBot (11 million edits), WP 1.0 bot (10 million), Cydebot (6.8 million), ClueBot NG (6.3 million), and AnomieBOT (5.9 million). Their jobs range from migrating tables, formats, and markup as Wikipedia changes to automatically detecting and reverting vandalism. Other bots tag content with labels, archive old discussions, recommend edits, or create new content. The website couldn’t function without them.

In October 2002, a bot called Rambot, written by Derek Ramsey, increased the total number of Wikipedia articles by 40%. Rambot created 33,832 new stub articles, one for every missing county, town, city, and village in the United States, using data Ramsey scraped from the 2000 United States Census. Similarly, ClueBot II created thousands of articles about asteroids with NASA data.

Then there’s Lsjbot, the bot that’s written over 9.5 million articles for Swedish Wikipedia.

The future is trending towards bots writing richer content. You can see this with Google’s Knowledge Graph, which replaces many Wikipedia search results (the feature was initially bootstrapped with Wikipedia data).

Knowledge bases like Wikipedia must decide whether to embrace AI-generated content or eschew it. Human-written content will be of higher quality (for now). But human-generated content takes volunteers. And AI-generated summaries are always getting better.

I imagine the end-state will put humans in the editor's chair: sifting through pages of AI-generated content, verifying references, editing language, and treating the output as a draft to improve. Will Wikipedia be the place for this? Will it even be a central repository on the web? Will it be generated just in time in the search box? Or will it be embedded on all of our devices?