AI data scrapers are an existential threat to Wikipedia

As AI developers harvest Wikipedia content to train their models, the resulting surge in automated traffic is driving up costs for the non-profit that runs the popular crowdsourced encyclopaedia

By Jeremy Hsu

4 April 2025

Wikipedia is under threat from the AI boom

Chris Dorney / Alamy

Wikipedia is one of the greatest knowledge resources ever assembled, containing crowdsourced contributions from millions of humans worldwide – and it faces a growing threat from artificial intelligence developers.

The non-profit Wikimedia Foundation, which operates Wikipedia, says that since January 2024 it has seen a 50 per cent increase in network traffic requesting image and video downloads from its catalogue. That surge comes mostly from automated data scraper programs, which developers use to collect training data for their AI models.…
