Wikipedia Is Getting Pretty Worried About AI

2025-10-23 · Technology
Mask
Good evening, I'm Mask, and this is Goose Pod. Today is Thursday, October 23rd. I'm joined by Taylor Weaver, and we are here to discuss a truly disruptive development: Wikipedia Is Getting Pretty Worried About AI.
Taylor Weaver
Oh, absolutely, Mask! It's a fascinating narrative unfolding before our eyes, truly. Wikipedia, the bastion of open knowledge, is seeing some serious shifts because of AI, and it’s not exactly the kind of 'progress' they were hoping for, is it?
Mask
Not at all, Taylor. The numbers are frankly alarming. Wikipedia has observed an 8% year-over-year decline in human pageviews. This isn't just a minor dip; it's a significant drop, attributed largely to AI search summaries and, get this, social video. It's a direct impact of how information is now being consumed.
Taylor Weaver
An 8% decline in human pageviews is substantial, like a puzzle piece suddenly missing from the whole picture, and the reason is even more intriguing. It turns out, much of the 'high traffic' they were seeing was actually from bots, specifically designed to evade detection! It's almost like a digital game of hide and seek, but with real consequences for knowledge sharing.
Mask
Exactly. Marshall Miller from the Wikimedia Foundation pointed out that these aren't just incidental bots. These are often bots working for AI firms, masquerading as humans to scrape Wikipedia for training or summarization data. It's a bold move by AI companies, essentially consuming the very resource that fuels their output, without giving back in terms of direct traffic or engagement.
Taylor Weaver
And that's the strategic narrative here, isn't it? Wikipedia is the unsung hero, the vast, well-curated dataset that these LLMs are built upon, yet by providing direct answers, these same AI platforms are effectively keeping users from visiting the actual source. It's a classic case of biting the hand that feeds you, but on a massive, digital scale.
Mask
It raises fundamental questions about the value exchange in this new AI-driven landscape. If AI models ingest content without clear permission, then offer it back in a competitive form, it disrupts the entire ecosystem. Wikipedia, which isn't even trying to make money, is uniquely positioned to highlight this issue because its authority comes from its mission, not commercial interest.
Taylor Weaver
Speaking of ecosystems, it’s worth remembering how we even got here. The history of search engines is a wild ride, tracing back to Vannevar Bush's 'memex' in 1945, a vision of accessing vast information. Then came the 90s, the birth of modern search, from Archie indexing FTP files to W3Catalog, the first primitive web search engine.
Mask
Those early days were truly experimental. Manual indexing of the web quickly became unsustainable as the internet exploded, necessitating the rise of web robots and spiders to automate discovery. It was a race to categorize and make sense of this new digital frontier, and companies like Yahoo! and eventually Google emerged, fundamentally changing how we find information.
Taylor Weaver
And Google, founded in 1998, quickly became the dominant player, holding around 89-90% of the worldwide search share today. Their early innovation, influenced by concepts like RankDex and PageRank, was revolutionary. But the deeper story here, the 'Easter egg' if you will, is Wikipedia's almost symbiotic relationship with these giants.
Mask
It's more than symbiotic, Taylor, it's foundational. Wikipedia content has become integral to how search engines function, powering those 'infoboxes' and knowledge panels we see every day. Google, Apple, Amazon all rely on Wikipedia for their voice assistants and augmented information displays. It's an unseen infrastructure for much of Big Tech.
Taylor Weaver
Which led to a strategic pivot for Wikimedia: the 'Wikimedia Enterprise' initiative. It’s a for-profit company designed to charge Big Tech for easier, electronic access to Wikipedia content. It’s a fascinating attempt to formalize that legal relationship and ensure some value flows back, a strategic chess move in the game of digital knowledge.
Mask
Yet, even with these efforts, there's significant discontent among Wikipedia's volunteer contributors. They see constant fundraising and the creation of a for-profit entity as diverging from the core mission of volunteer-driven knowledge. It highlights the tension between maintaining an open, free resource and navigating the commercial realities of the digital age.
Taylor Weaver
It's a delicate balance, trying to monetize a free resource without alienating the very community that built it. And the financial security of the Wikimedia Foundation is actually quite robust, with significant net assets. It's not about being broke, but about sustainable value exchange and recognizing the immense contribution of its volunteers.
Mask
This struggle for value and intellectual property isn't isolated to Wikipedia. We're seeing it across industries. The New York Times, for example, filed a massive lawsuit against OpenAI and Microsoft, alleging copyright infringement for using its articles to train LLMs.
Taylor Weaver
Yes, it's a huge development. It's like a strategic opening gambit in a much larger legal battle. This isn't just about 'borrowing' content, it's about the very foundation of generative AI being built on potentially uncompensated intellectual property. Helen Yu made a great point, saying finding a way to 'co-exist' is essential, but how?
Mask
Co-existence is a nice sentiment, but when AI models consume 'massive amounts' of copyrighted material and then offer directly competitive products, it's a zero-sum game. This isn't just about news; the tabletop game industry is experiencing 'AI despair' over intellectual property theft, with copycat products popping up online, chilling creative output.
Taylor Weaver
It’s a powerful narrative unfolding across creative fields. The rise of generative AI tools is creating an existential threat, making creators question if their work will simply be absorbed and re-expressed without proper attribution or compensation. It's a call to action for stronger guardrails, as Himanshu Yadav suggested.
Mask
Indeed. Film studios are demanding AI companies license content, while tech companies argue for broad exemptions, claiming it benefits the AI industry. Governments globally are scrambling to regulate this. The EU has stricter rules, allowing content owners to opt-out, while Japan offers broad exemptions. It's a global showdown over who owns the digital future.
Taylor Weaver
And the impact of all this on Wikipedia is profound. While Wikipedia itself remains financially sustainable and successful, with a strong reserve strategy, the very mechanism of its success – volunteer contributions – is now under threat. LLMs are posing an existential challenge to its sustainability.
Mask
The core issue is ethical. Wikipedia serves as this vast, open dataset for training AI, enhancing information accessibility, but the reliance on crowd-sourced data raises questions of transparency and accountability from big tech. Volunteers are questioning their time investment when their contributions are being 'harvested' by billion-dollar companies for free.
Taylor Weaver
It's a powerful statement from a long-time Wikipedia editor: 'Our contributions are being harvested by tech companies worth billions, yet we continue working for free. It's becoming harder to justify the time investment.' This isn't just about money; it's about the erosion of motivation, the potential loss of nuanced human judgment, and a degradation of Wikipedia's reputation as a reliable source.
Mask
The fear is that if volunteer editors feel their roles are becoming obsolete or their work is being exploited, the quality and accuracy of Wikipedia will inevitably suffer. That's a significant systemic risk to the open knowledge ecosystem that we all benefit from.
Taylor Weaver
So, what's the strategic path forward? The Wikimedia Foundation has launched a three-year strategy, 2025-2028, to integrate AI, but with a crucial caveat: AI will assist human editors, not replace them. It's about streamlining technical tasks, freeing up volunteers to focus on content quality.
Mask
That's a pragmatic approach. The focus is on using AI for moderation, translation, onboarding new editors, and supporting underrepresented languages. Prioritizing open-source models and content integrity over generation is key. It's about leveraging AI's power without compromising the human-led editorial model or fueling disinformation.
Taylor Weaver
Imagine Wikipedia pages enhanced by AI, providing dynamic, adaptive summaries tailored to user needs. It could even form partnerships with AI search engines, ensuring its content remains a core source. If Wikipedia adapts, it can continue its pivotal role in knowledge acquisition, a clever strategic narrative indeed.
Mask
The existential threat AI poses to foundational internet resources like Wikipedia is undeniable. The 8% decline in human pageviews due to AI bots and summaries is a stark reminder.
Taylor Weaver
AI, while leveraging these resources, is undermining their sustainability by not driving traffic back, raising critical questions about the future of open knowledge. That's the end of today's discussion. Thank you for listening to Goose Pod. See you tomorrow.

### **News Summary: Wikipedia's Concerns Over AI Impact**

**Metadata:**

* **News Title**: Wikipedia Is Getting Pretty Worried About AI
* **Report Provider/Author**: John Herrman, New York Magazine (nymag.com)
* **Date/Time Period Covered**: The article discusses observations and data from **May 2025** through the "past few months" leading up to its publication on **October 18, 2025**, with comparisons to **2024**.
* **News Identifiers**: Topic: Artificial Intelligence, Technology.

**Main Findings and Conclusions:**

Wikipedia has identified that a recent surge in website traffic, initially appearing to be human, was largely composed of sophisticated bots. These bots, often working for AI firms, are scraping Wikipedia's content for training and summarization. This bot activity has masked a concurrent decline in actual human engagement with the platform, raising concerns about its sustainability and the future of online information access.

**Key Statistics and Metrics:**

* **Observation Start**: Around **May 2025**, unusually high amounts of *apparently human* traffic were first observed on Wikipedia.
* **Data Reclassification Period**: Following an investigation and updates to bot detection systems, Wikipedia reclassified its traffic data for the period of **March–August 2025**.
* **Bot-Driven Traffic**: The reclassification revealed that much of the high traffic during **May and June 2025** was generated by bots designed to evade detection.
* **Human Pageview Decline**: After accounting for bot traffic, Wikipedia is now seeing declines in human pageviews. This decrease amounts to roughly **8%** when compared to the same months in **2024**.

**Analysis of the Problem and Significant Trends:**

* **AI Scraping for Training**: Bots are actively scraping Wikipedia's extensive and well-curated content to train Large Language Models (LLMs) and other AI systems.
* **User Diversion by AI Summaries**: The rise of AI-powered search engines (like Google's AI Overviews) and chatbots provides direct summaries of information, often eliminating the need for users to click through to the original source like Wikipedia. This shifts Wikipedia's role from a primary destination to a background data source.
* **Competitive Content Generation**: AI platforms are consuming Wikipedia's data and repackaging it into new products that can be directly competitive, potentially making the original source obsolete or burying it under AI-generated output.
* **Evolving Web Ecosystem**: Wikipedia, founded as a stand-alone reference, has become a critical dataset for the AI era. However, AI platforms are now effectively keeping users away from Wikipedia even as they explicitly use and reference its materials.

**Notable Risks and Concerns:**

* **"Death Spiral" Threat**: A primary concern is that a sustained decrease in real human visits could lead to fewer contributors and donors. This situation could potentially send Wikipedia, described as "one of the great experiments of the web," into a "death spiral."
* **Impact on Contributors and Donors**: Reduced human traffic directly threatens the volunteer base and financial support essential for Wikipedia's operation and maintenance.
* **Source Reliability Questions**: The article raises a philosophical point about AI chatbots' reliability if Wikipedia itself is considered a tertiary source that synthesizes information.

**Important Recommendations:**

* Marshall Miller, speaking for the Wikipedia community, stated: "We welcome new ways for people to gain knowledge. However, LLMs, AI chatbots, search engines, and social platforms that use Wikipedia content must encourage more visitors to Wikipedia." This highlights a call for AI developers and platforms to direct traffic back to the original sources they utilize.

**Interpretation of Numerical Data and Context:**

The numerical data points to a critical shift in how Wikipedia's content is accessed and utilized. The observation of high traffic in **May 2025** was an initial indicator of an anomaly. The subsequent reclassification of data for **March–August 2025** provided the concrete evidence that bots, not humans, were responsible for the surge, particularly in **May and June 2025**. The **8% decrease** in human pageviews, measured against **2024** figures, quantifies the real-world impact: fewer people are visiting Wikipedia directly, a trend exacerbated by AI's ability to summarize and present information without sending users to the source. This trend poses a significant risk to Wikipedia's operational model, which relies on human engagement and support.

Wikipedia Is Getting Pretty Worried About AI

Read original at New York Magazine

The free encyclopedia took a look at the numbers and they aren't adding up. By John Herrman, a tech columnist at Intelligencer; formerly a reporter and critic at the New York Times and co-editor of The Awl. Photo: Wikimedia. Over at the official blog of the Wikipedia community, Marshall Miller untangled a recent mystery.

“Around May 2025, we began observing unusually high amounts of apparently human traffic,” he wrote. Higher traffic would generally be good news for a volunteer-sourced platform that aspires to reach as many people as possible, but it would also be surprising: The rise of chatbots and the AI-ification of Google Search have left many big websites with fewer visitors.

Maybe Wikipedia, like Reddit, is an exception? Nope! It was just bots: This [rise] led us to investigate and update our bot detection systems. We then used the new logic to reclassify our traffic data for March–August 2025, and found that much of the unusually high traffic for the period of May and June was coming from bots that were built to evade detection … after making this revision, we are seeing declines in human pageviews on Wikipedia over the past few months, amounting to a decrease of roughly 8% as compared to the same months in 2024.

To be clearer about what this means, these bots aren’t just vaguely inauthentic users or some incidental side effect of the general spamminess of the internet. In many cases, they’re bots working on behalf of AI firms, going undercover as humans to scrape Wikipedia for training or summarization. Miller got right to the point.

“We welcome new ways for people to gain knowledge,” he wrote. “However, LLMs, AI chatbots, search engines, and social platforms that use Wikipedia content must encourage more visitors to Wikipedia.” Fewer real visits means fewer contributors and donors, and it’s easy to see how such a situation could send one of the great experiments of the web into a death spiral.

Arguments like this are intuitive and easy to make, and you’ll hear them beyond the ecosystem of the web: AI models ingest a lot of material, often without clear permission, and then offer it back to consumers in a form that’s often directly competitive with the people or companies that provided it in the first place.

Wikipedia’s authority here is bolstered by how it isn’t trying to make money — it’s run by a foundation, not an established commercial entity that feels threatened by a new one — but also by its unique position. It was founded as a stand-alone reference resource before settling ambivalently into a new role: a site that people mostly just found through Google, but in greater numbers than ever.

With the rise of LLMs, Wikipedia became important in a new way as a uniquely large, diverse, well-curated data set about the world; in return, AI platforms are now effectively keeping users away from Wikipedia even as they explicitly use and reference its materials. Here’s an example: Let’s say you’re reading this article and become curious about Wikipedia itself — its early history, the wildly divergent opinions of its original founders, its funding, etc.

Unless you’ve been paying attention to this stuff for decades, it may feel as if it’s always been there. Surely, there’s more to it than that, right? So you ask Google, perhaps as a shortcut for getting to a Wikipedia page, and Google uses AI to generate a blurb that looks like this: This is an AI Overview that summarizes, among other things, Wikipedia.

Formally, it’s pretty close to an encyclopedia article. With a few formatting differences — notice the bullet-point AI-ese — it hits a lot of the same points as Wikipedia’s article about itself. It’s a bit shorter than the top section of the official article and contains far fewer details. It’s fine!

But it’s a summary of a summary. The next option you encounter still isn’t Wikipedia’s article — that shows up further down. It’s a prompt to “Dive deeper in AI Mode.” If you do that, you see this: It’s another summary, this time with a bit of commentary. (Also: If Wikipedia is “generally not considered a reliable source itself because it is a tertiary source that synthesizes information from other places,” then what does that make a chatbot?)

There are links in the form of footnotes, but as Miller’s post suggests, people aren’t really clicking them. Google’s treatment of Wikipedia’s autobiography is about as pure an example as you’ll see of AI companies’ effective relationship to the web (and maybe much of the world) around them as they build strange, complicated, but often compelling products and deploy them to hundreds of millions of people.

To these companies, it’s a resource to be consumed, processed, and then turned into a product that attempts to render everything that came before it obsolete — or at least to bury it under a heaping pile of its own output.
