Published on

5% of new Wikipedia articles are AI-written

Authors
  • Jadru

Recent research has revealed a significant increase in AI-generated content on Wikipedia, particularly in newly created articles. Here are the key findings:


Prevalence of AI-Generated Content

A 2024 study found that approximately 5% of newly created English Wikipedia articles contain significant AI-generated content[1][3]. The estimate comes from analyzing articles created in August 2024 and comparing them against a baseline of articles created before March 2022, that is, before the release of GPT-3.5[1].

Detection Methods

The researchers used two AI detection tools:

  1. GPTZero: A proprietary AI detector
  2. Binoculars: An open-source alternative

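Both tools were calibrated to a 1% false positive rate on pre-GPT-3.5 articles: the detection threshold was set so that only about 1 in 100 articles known to be human-written would be flagged by mistake[1][3]. This limits false alarms, but it does not guarantee that every AI-generated article is caught.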
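The calibration step can be sketched as follows. This is a minimal illustration, not the study's code: the function names `calibrate_threshold` and `flagged_rate` are hypothetical, the scores are synthetic, and it assumes a detector where a higher score means "more likely AI-generated" (for a detector that scores the other way around, negate the scores first).

```python
def calibrate_threshold(baseline_scores, target_fpr=0.01):
    """Pick a cutoff so that at most target_fpr of baseline scores exceed it.

    baseline_scores: detector scores for articles assumed to be human-written
    (here, a stand-in for the pre-GPT-3.5 baseline).
    """
    if not baseline_scores:
        raise ValueError("baseline_scores must not be empty")
    if not 0 <= target_fpr < 1:
        raise ValueError("target_fpr must be in [0, 1)")
    ordered = sorted(baseline_scores)
    n = len(ordered)
    # Number of baseline articles we allow to be flagged by mistake.
    # The small epsilon guards against floating-point error in the product.
    allowed = min(int(target_fpr * n + 1e-9), n - 1)
    # Flagging strictly above this value flags at most `allowed` baseline items.
    return ordered[n - allowed - 1]


def flagged_rate(scores, threshold):
    """Fraction of articles whose score is strictly above the threshold."""
    return sum(score > threshold for score in scores) / len(scores)


# Synthetic scores for illustration only; these are not the study's data.
baseline = [i / 100 for i in range(100)]
recent = [0.5] * 94 + [0.995] * 6

threshold = calibrate_threshold(baseline, target_fpr=0.01)
baseline_rate = flagged_rate(baseline, threshold)
recent_rate = flagged_rate(recent, threshold)
```

The design choice here is to fix the error rate on known human-written text first and only then apply the same cutoff to newer articles, so any flag rate well above the calibrated 1% points to a real change rather than detector noise.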

Findings Across Languages

English Wikipedia showed the highest rate of AI-generated content among the languages studied:

  • English: Over 5% of new articles flagged as AI-generated
  • German, French, and Italian: Lower percentages of new articles flagged[3]

Characteristics of AI-Generated Articles

The study found that articles flagged as AI-generated typically exhibited:

  1. Lower quality compared to human-written articles
  2. Fewer references and footnotes per sentence
  3. Fewer outgoing links per word
  4. A tendency to be self-promotional or to favor a particular viewpoint on controversial topics[1]
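The two density measures in the list above (references per sentence and outgoing links per word) can be approximated directly from an article's wikitext. The sketch below is an assumption about how such metrics might be computed, not the study's method: the function name `citation_and_link_density` is hypothetical, and the regular expressions are rough heuristics that ignore templates, tables, and nested markup.

```python
import re


def citation_and_link_density(wikitext):
    """Roughly estimate references per sentence and wiki links per word."""
    # Count opening <ref> tags, including self-closing ones like <ref name="x" />.
    refs = len(re.findall(r"<ref[\s>/]", wikitext))
    # Count internal links of the form [[Target]] or [[Target|label]].
    links = len(re.findall(r"\[\[[^\[\]]+\]\]", wikitext))

    # Strip reference markup so citation text is not counted as prose.
    plain = re.sub(r"<ref[^>]*/>|<ref[^>]*>.*?</ref>", "", wikitext, flags=re.S)
    # Replace each link with its visible label.
    plain = re.sub(r"\[\[(?:[^\[\]|]*\|)?([^\[\]]+)\]\]", r"\1", plain)

    words = len(plain.split())
    # Naive sentence split on terminal punctuation.
    sentences = len([s for s in re.split(r"[.!?]+", plain) if s.strip()])

    return {
        "refs_per_sentence": refs / sentences if sentences else 0.0,
        "links_per_word": links / words if words else 0.0,
    }


# Made-up sample text for illustration only.
sample = (
    "Paris is the capital of [[France]].<ref>Source A</ref> "
    'It lies on the [[Seine|river Seine]].<ref name="b" />'
)
metrics = citation_and_link_density(sample)
```

Comparing these densities between flagged and unflagged articles is one simple way to check whether flagged articles are less well sourced and less well connected to the rest of the encyclopedia.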

Implications

The rise of AI-generated content on Wikipedia raises concerns about:

  1. Accountability
  2. Accuracy
  3. Bias amplification
  4. Long-term viability of training language models on internet data[1]

These findings are best read as a lower bound: detectors miss some AI-generated text (false negatives), so the actual amount of AI-generated content could be higher[1].
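One way to see why a flag rate translates into a lower bound is the short derivation below. It is a sketch under stated assumptions, not a calculation from the study: $p$ (true share of AI-generated articles), $t$ (detector true positive rate), $f$ (false positive rate), and $r$ (observed flag rate) are symbols introduced here, and the values $r = 0.05$ and $f = 0.01$ are round illustrative numbers.

```latex
% Flagged articles are either true detections or false alarms:
r = p\,t + (1 - p)\,f
% Solving for the true share p (assuming t > f):
p = \frac{r - f}{t - f}
% Because no detector catches more than everything (t \le 1),
% the denominator is at most 1 - f, so for r \ge f:
p \;\ge\; \frac{r - f}{1 - f}
% Illustrative values r = 0.05, f = 0.01:
p \;\ge\; \frac{0.05 - 0.01}{1 - 0.01} = \frac{0.04}{0.99} \approx 0.040
```

The bound is reached only if the detector catches every AI-generated article ($t = 1$); any missed articles push the true share above it.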

AI-generated content poses challenges, but some researchers are also exploring ways AI could improve Wikipedia. A separate study found that AI tools might help improve citation quality and verifiability on the platform[5].

Citations: