- Published on
About 5% of new English Wikipedia articles are AI-written
- Author: Jadru
Recent research points to a measurable rise in AI-generated content on Wikipedia, concentrated in newly created articles. The key findings are summarized below.


Prevalence of AI-Generated Content
A study conducted in 2024 found that approximately 5% of newly created English Wikipedia articles contain significant AI-generated content[1][3]. This estimate is based on an analysis of articles created in August 2024, compared to a baseline of articles created before March 2022 (pre-GPT-3.5 era)[1].
Detection Methods
The researchers used two AI detection tools:
- GPTZero: A proprietary AI detector
- Binoculars: An open-source alternative
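Each tool's detection threshold was calibrated to a 1% false positive rate on pre-GPT-3.5 articles, meaning that only about 1% of articles known to predate widely available LLMs would be wrongly flagged[1][3]. This calibration limits false alarms; it does not guarantee that every AI-generated article is caught.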
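The calibration step can be sketched in a few lines. This is a minimal illustration, not the study's actual code: it assumes each detector returns a numeric score where a higher value means "more likely AI-generated" (some detectors use the opposite orientation, in which case the scores would need to be negated first), and the function names `calibrate_threshold` and `flag_rate` are hypothetical.

```python
def calibrate_threshold(baseline_scores, target_fpr=0.01):
    """Pick a threshold so that at most `target_fpr` of the baseline
    (assumed human-written) scores lie strictly above it.

    Assumption: higher score = more likely AI-generated.
    """
    if not baseline_scores:
        raise ValueError("need at least one baseline score")
    ordered = sorted(baseline_scores)
    n = len(ordered)
    # Number of baseline articles we tolerate being flagged.
    # The small epsilon guards against float round-off such as 0.29 * 100.
    allowed = int(target_fpr * n + 1e-9)
    # At most `allowed` baseline scores are strictly greater than this value.
    return ordered[n - allowed - 1]


def flag_rate(scores, threshold):
    """Fraction of scores strictly above the threshold."""
    if not scores:
        raise ValueError("need at least one score")
    return sum(1 for s in scores if s > threshold) / len(scores)


if __name__ == "__main__":
    # Synthetic stand-in for detector scores on pre-March-2022 articles.
    baseline = [i / 100 for i in range(100)]
    threshold = calibrate_threshold(baseline, target_fpr=0.01)
    print("threshold:", threshold)
    print("baseline flag rate:", flag_rate(baseline, threshold))
```

The same threshold would then be applied unchanged to the newer articles, so that any flag rate well above 1% suggests content the baseline period did not contain.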
Findings Across Languages
English Wikipedia showed the highest rate of flagged articles among the languages studied:
- English: Over 5% of new articles flagged as AI-generated
- German, French, and Italian: Lower percentages of AI-generated content detected[3]
Characteristics of AI-Generated Articles
The study found that articles flagged as AI-generated typically exhibited:
- Lower quality compared to human-written articles
- Fewer references and footnotes per sentence
- Fewer outgoing links per word
- A tendency to be self-promotional or to push a particular viewpoint on controversial topics[1]
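The two density measures in the list above (references per sentence and outgoing links per word) can be approximated directly from an article's wikitext. The sketch below is a rough illustration rather than the study's method: it assumes references appear as `<ref>` tags and outgoing links as `[[...]]` wikilinks, it uses a naive sentence splitter, and the function name `density_metrics` is hypothetical.

```python
import re

# Self-closing <ref ... /> tags, or paired <ref ...>...</ref> tags.
REF_TAG = re.compile(r"<ref\b[^>]*?/>|<ref\b[^>]*>.*?</ref>", re.DOTALL)
# [[Target]] or [[Target|display text]].
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")


def density_metrics(wikitext):
    """Return references per sentence and wikilinks per word (crude estimate)."""
    n_refs = sum(1 for _ in REF_TAG.finditer(wikitext))
    n_links = sum(1 for _ in WIKILINK.finditer(wikitext))

    # Strip references, then replace each link with its visible text,
    # so that markup is not counted as words.
    text = REF_TAG.sub("", wikitext)
    text = WIKILINK.sub(lambda m: m.group(2) or m.group(1), text)

    words = text.split()
    # Naive splitter: a sentence ends at ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    return {
        "refs_per_sentence": n_refs / len(sentences) if sentences else 0.0,
        "links_per_word": n_links / len(words) if words else 0.0,
    }


if __name__ == "__main__":
    sample = (
        "[[Paris]] is the capital of [[France|the French Republic]]."
        "<ref>Example source</ref> It lies on the Seine.<ref name=\"a\" />"
    )
    print(density_metrics(sample))
```

Comparing these ratios between flagged and unflagged articles is one way to check whether flagged articles are more thinly sourced; templates, tables, and external links would need extra handling in real wikitext.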
Implications
The rise of AI-generated content on Wikipedia raises concerns about:
- Accountability
- Accuracy
- Bias amplification
- Long-term viability of training language models on internet data[1]
These findings are a lower bound: detectors miss some AI-generated text (false negatives), so the actual share of AI-generated content could be higher[1].
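The lower-bound reasoning can be made explicit with a little arithmetic. If a fraction p of new articles is AI-generated, the detector catches a fraction TPR of those, and it wrongly flags a fraction FPR of the rest, then the observed flag rate is f = p·TPR + (1 − p)·FPR, which gives p = (f − FPR) / (TPR − FPR). Because TPR cannot exceed 1, p is at least (f − FPR) / (1 − FPR). The sketch below encodes this under the assumption that the calibrated false positive rate also holds on new articles; the function names and the example numbers are illustrative and are not figures from the study.

```python
def prevalence_lower_bound(flag_rate, fpr):
    """Lower bound on the true AI-generated share, assuming a perfect
    detection rate (TPR = 1), which is the most conservative case."""
    if not (0.0 <= flag_rate <= 1.0 and 0.0 <= fpr < 1.0):
        raise ValueError("rates must be in [0, 1] and fpr must be below 1")
    return max(0.0, (flag_rate - fpr) / (1.0 - fpr))


def prevalence_estimate(flag_rate, fpr, tpr):
    """Point estimate of the true share for an assumed detection rate `tpr`."""
    if not (0.0 <= flag_rate <= 1.0 and 0.0 <= fpr < tpr <= 1.0):
        raise ValueError("need 0 <= fpr < tpr <= 1 and flag_rate in [0, 1]")
    return min(1.0, max(0.0, (flag_rate - fpr) / (tpr - fpr)))


if __name__ == "__main__":
    # Hypothetical inputs for illustration only.
    observed = 0.06   # share of new articles flagged
    fpr = 0.01        # calibrated false positive rate
    print("lower bound:", prevalence_lower_bound(observed, fpr))
    # If the detector only caught half of AI-generated articles,
    # the implied true share would be roughly twice the lower bound.
    print("estimate at TPR=0.5:", prevalence_estimate(observed, fpr, 0.5))
```

The lower the detector's true detection rate, the further the real share sits above the bound, which is why missed detections push the estimate up rather than down.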
While AI-generated content poses challenges, some researchers are also exploring how AI could improve Wikipedia. A separate study found that AI tools might help improve citation quality and verifiability on the platform[5].
Citations:
- [1] https://arxiv.org/html/2410.08044v1
- [2] https://www.reddit.com/r/singularity/comments/1g5gxwf/at_least_5_of_new_wikipedia_articles_in_august/
- [3] https://arxiv.org/abs/2410.08044
- [4] https://www.newscientist.com/article/2454256-one-in-20-new-wikipedia-pages-seem-to-be-written-with-the-help-of-ai/
- [5] https://www.newsweek.com/could-ai-help-make-wikipedia-better-1837121