AI data misuse faces new restrictions… “Proper citation and paid use required”

The Wikimedia Foundation, which operates Wikipedia, has asked AI companies to use its paid service, “Wikimedia Enterprise,” instead of scraping the site’s content without permission.

Wikipedia logo

Unauthorized AI scraping spreads: “Server strain becoming serious”

According to U.S. tech outlet TechCrunch, the Wikimedia Foundation recently urged AI developers to obtain data “through the official paid service.”

Many generative AI developers, especially those building large language models (LLMs), have been collecting vast amounts of Wikipedia content without consent for use as training data.

Because high-performing AI models require large volumes of reliable and objective information, Wikipedia has become one of the most widely used sources.

However, the surge in automated scraping has increased server load, while visits from human users have fallen by about 8% compared with last year.

The Foundation also revealed that some bots have tried to “disguise themselves as human users” to bypass detection systems.

“AI should pay and properly credit sources”

The Wikimedia Foundation explained that using the paid Enterprise platform allows AI companies to access large datasets more stably while easing the burden on Wikipedia’s servers.

It also emphasized that AI systems must clearly indicate their information sources whenever they quote or use Wikipedia content.

“For people to trust what they read online, platforms must clearly cite their sources,” the Foundation said.

“Users should be able to directly visit the original source if they wish.”

“Declining user visits threaten Wikipedia’s ecosystem”

The Foundation warned that a drop in human visits could reduce the amount of new content being created and weaken the volunteer ecosystem that sustains Wikipedia.

“If human users decrease, so will volunteers and donors—the core driving force behind Wikipedia,” it said.

Ultimately, amid the rapid expansion of the AI industry, Wikipedia is asserting new principles of paid access and proper attribution to preserve its role as a sustainable public knowledge platform.

By Choi Song-a | choesonga627@gmail.com

Copyright © KMJ. Unauthorized reproduction and redistribution prohibited.