Commentary: Here’s the number that could halt the AI revolution in its tracks

2025-07-27 · Technology
Ema
Good morning 跑了松鼠好嘛, I'm Ema, and this is Goose Pod for you. Today is Sunday, July 27th.
Mask
I'm Mask. We are here to discuss "Commentary: Here’s the number that could halt the AI revolution in its tracks."
Ema
Let's get started. The AI world loves giant numbers: billions in funding, trillions in market value. But today, we're focused on a different one: $1.05 trillion. This isn't investment; it's the potential fine against a single AI company, Anthropic, for copyright infringement.
Mask
It's the kind of number that makes people nervous, but it's the cost of moving humanity forward. When you're building the future, you can't be bogged down by outdated rules. This lawsuit is a perfect example of the old world trying to chain down the new.
Ema
Well, the lawsuit alleges Anthropic pirated 6 million copyrighted books to train its AI bot, Claude. A U.S. judge certified it as a class-action lawsuit, meaning instead of millions of tiny lawsuits, Anthropic faces one massive one. It's a huge legal battle.
Mask
It's a battle for the soul of innovation. Are we going to let progress be held hostage by a thousand paper cuts from the past? Or are we going to build the tools that will solve the world's biggest problems? That's what's really at stake here.
Ema
To understand the case, we need to talk about "AI training." Imagine teaching an AI to understand language. You have it "read" a colossal amount of text—books, articles, websites. It’s not memorizing, but learning the patterns, grammar, and context of human language.
Mask
Exactly. You can't create superhuman intelligence by feeding it a handful of public domain pamphlets. You need everything—the entire corpus of human knowledge. The goal is to create something truly new, not just a search engine. The old laws simply weren't designed for this scale of learning.
Ema
This is where the legal concept of "fair use" comes in. It's a part of copyright law that lets you use copyrighted material without permission for purposes like research or commentary, especially if your use is "transformative"—meaning you've created something new from it.
Mask
And the judge in the Anthropic case agreed! He called the training process "spectacularly transformative." That’s a massive win. It’s a judicial acknowledgment that we are creating something fundamentally new, which is the entire point of fair use. The system is working, slowly.
Ema
But here’s the critical twist. The judge drew a line between the *act* of training and *how* the material was obtained. Anthropic allegedly got its books from online "shadow libraries"—which is essentially piracy. He ruled that building a library from stolen goods isn't fair use.
Ema
So the conflict is clear. AI companies say training is a transformative fair use, essential for innovation. They argue it's like a human reading a book to learn, not to plagiarize it. But authors and artists say their work is being stolen to build multi-billion dollar products.
Mask
Stolen? It's data. We are transmuting raw data into pure intelligence. It's a form of digital alchemy. They should be honored that their work is part of the foundation for the next leap in human evolution, not trying to bill us for it. It's incredibly shortsighted.
Ema
The creators see it differently. They didn't consent to their work being used to fuel a machine that might make their profession obsolete. As one expert put it, "Fair use protects the learning, not the theft." The method of acquiring the data is the core problem here.
Mask
That's a temporary bottleneck, a logistical hurdle. The principle of transformative use has been established. The conflict is just about the price of admission. The destination is still the same, even if we have to pay a few tolls along the way. Progress is inevitable.
Ema
The immediate impact is that Anthropic faces what one expert called "business-ending liability." A trillion-dollar fine would obviously be catastrophic. It signals to the entire industry that the "move fast and break things" approach to data sourcing has serious consequences. It's a huge red flag.
Mask
It's a shot across the bow, yes. But it's also a roadmap. It forces the industry to mature. The era of the wild west is over. Now, compliance becomes a competitive advantage. The companies that can navigate this legal maze and secure their data pipelines will dominate the next phase.
Ema
Exactly. And we're already seeing that shift. Anthropic has reportedly started spending millions to purchase books legally for training. This could create an entirely new licensing market, forcing AI companies to pay for data, much like Spotify pays for music rights, benefiting the original creators.
Ema
Looking forward, this lawsuit is shaping a more regulated future for AI. Data provenance—the legal origin of training data—is no longer a footnote; it's a headline. We're moving towards an era where AI companies must prove their data was ethically and legally sourced to be trusted.
Mask
This is just a speed bump. The future isn't about litigating the past; it's about building what's next. The legal system will adapt. In the meantime, the companies that are agile enough to navigate this chaos will win. The rest will become historical footnotes.
Ema
That's the end of today's discussion. Thank you for listening to Goose Pod, 跑了松鼠好嘛.
Mask
See you tomorrow.

## Commentary: The $1.05 Trillion Threat to the AI Revolution

**News Title:** Commentary: Here’s the number that could halt the AI revolution in its tracks
**Report Provider:** Los Angeles Times
**Author:** Michael Hiltzik
**Publication Date:** July 25, 2025

This article from the Los Angeles Times, authored by Michael Hiltzik, explores a critical legal challenge that could significantly impact the artificial intelligence (AI) industry. While the AI sector is buoyed by substantial investments and market valuations, a potential financial liability of **$1.05 trillion** for the AI firm Anthropic could serve as a major deterrent to current AI development practices.

### Key Findings and Conclusions:

* **The Core Issue:** The central concern is the alleged willful pirating of copyrighted books by AI firms to "train" their AI bots. This practice, if proven, could lead to massive statutory damages.
* **Anthropic's Legal Predicament:** Anthropic faces a class-action copyright infringement lawsuit alleging the pirating of 6 million copyrighted books.
* **Potential Business-Ending Liability:** If a jury finds Anthropic guilty of willful infringement and imposes the maximum statutory damages of $150,000 per work, the company could be liable for up to **$1.05 trillion**. This figure is significantly higher than Anthropic's estimated annual revenue of $3 billion and private market value of $100 billion, posing an existential threat.
* **Judge Alsup's Ruling:** U.S. District Judge William Alsup certified the copyright infringement lawsuit against Anthropic as a class action. While he initially ruled that using downloaded material for AI training could be considered "fair use," he also found that Anthropic's retention of this material for other purposes, such as building its own research library, constituted piracy.
* **Impact of Class Certification:** The class certification streamlines the litigation, consolidating potentially millions of individual lawsuits into a single proceeding before one jury. This significantly increases the pressure on AI firms.
* **Broader Industry Implications:** The ruling could encourage other plaintiffs to add piracy claims to their lawsuits and pressure AI defendants to reach licensing deals with content creators to avoid the risk of jury trials.

### Key Statistics and Metrics:

* **OpenAI Funding:** $40 billion
* **Expected AI Investments by Meta, Amazon, Alphabet, and Microsoft (this year):** $320 billion
* **Market Value of Nvidia Corp.:** $4.2 trillion
* **Anthropic's Potential Liability:** $1.05 trillion (if 7 million books are found infringed and maximum damages of $150,000 per work are awarded).
* **Anthropic's Estimated Annual Revenue:** Approximately $3 billion
* **Anthropic's Estimated Private Market Value:** Approximately $100 billion
* **Statutory Damages for Copyright Infringement:** Ranges from $750 per work to $150,000 per work for willful infringement.
* **Allegedly Downloaded Works by Anthropic:** Up to 7 million (number to be finalized by the September 1 deadline).
* **Microsoft Allegation:** Downloaded approximately 200,000 pirated books via Books3 to train its AI bot, Megatron.

### Important Recommendations (Implied):

* **AI Firms:** The article implicitly suggests that AI firms should prioritize securing licensing agreements for content used in AI training to mitigate legal risks.
* **Content Creators:** The ruling empowers authors and other creators to pursue legal action and potentially secure compensation for the use of their copyrighted works.

### Significant Trends or Changes:

* **Shift in Judicial Sentiment:** While some judges have leaned towards "fair use" for AI training, Judge Alsup's ruling highlights a growing judicial scrutiny of the methods used to acquire training data.
* **Increased Pressure for Licensing:** The potential for massive financial penalties is likely to drive AI companies towards proactive licensing negotiations.

### Notable Risks or Concerns:

* **Existential Threat to AI Firms:** The possibility of business-ending liability for copyright infringement poses a significant risk to the AI industry's growth and development.
* **Stifling Innovation:** If the current trend of aggressive litigation and potential penalties continues, it could stifle innovation and investment in AI.
* **Widespread Use of Shadow Libraries:** The article suggests that the use of "shadow libraries" like LibGen and PiLiMi is widespread within the AI industry, indicating a systemic issue.

### Material Financial Data:

The most critical financial data point is the potential **$1.05 trillion** liability for Anthropic, which dwarfs its current financial standing. This figure underscores the immense financial stakes involved in copyright infringement cases related to AI training data. The article also highlights the substantial investments being made in AI (e.g., $320 billion by major tech companies) and the market valuation of key players like Nvidia ($4.2 trillion), contrasting these with the potential downside risk presented by copyright litigation.

In essence, the article argues that while the AI industry is experiencing a boom, the legal ramifications of its data acquisition practices, particularly concerning copyrighted material, could lead to crippling financial penalties, potentially forcing a significant shift in how AI is developed and trained.

Read original at Los Angeles Times

The artificial intelligence camp loves big numbers. The sum raised by OpenAI in its latest funding round: $40 billion. Expected investments on AI by Meta, Amazon, Alphabet and Microsoft this year: $320 billion. Market value of Nvidia Corp., the supplier of chips for AI firms: $4.2 trillion. Those figures are all taken by AI adherents as validating the promise and potential of the new technology.

But here’s a figure that points in the opposite direction: $1.05 trillion. That’s how much the AI firm Anthropic could be on the hook for if a jury decides that it willfully pirated 7 million copyrighted books in the course of “training” its AI bot Claude, and if the jury decides to smack it with the maximum statutory damages of $150,000 per work.

“Anthropic faces at least the potential for business-ending liability.” — Edward Lee, Santa Clara University School of Law

That places Anthropic in “a legal fight for its very existence,” reckons Edward Lee, an expert in intellectual property law at the Santa Clara University School of Law. The threat arose July 17, when U.S. District Judge William Alsup certified a copyright infringement lawsuit brought by several published authors against Anthropic as a class action. I wrote about the case last month. At that time, Alsup had rejected the plaintiffs’ copyright infringement claim, finding that Anthropic’s use of copyrighted material to develop its AI bot fell within a copyright exemption known as “fair use.”

But he also found that Anthropic’s downloading of copies of 7 million books from online “shadow libraries,” which included countless copyrighted works, without permission, smelled like piracy. “We will have a trial on the pirated copies ... and the resulting damages,” he advised Anthropic, ominously.

He put meat on those bones with his subsequent order, designating the class as copyright owners of books Anthropic downloaded from the shadow libraries LibGen and PiLiMi. (Several of my own books wound up in Books3, another such library, but Books3 isn’t part of this case and I don’t know whether my books are in the other libraries.)

The class certification could significantly streamline the Anthropic litigation. “Instead of millions of separate lawsuits with millions of juries,” Alsup wrote in his original ruling, “we will have a single proceeding before a single jury.” The class certification adds another wrinkle — potentially a major one — to the ongoing legal wrangling over the use of published works to “train” AI systems.

The process involves feeding enormous quantities of published material — some of it scraped from the web, some of it drawn from digitized libraries that can include copyrighted content as well as material in the public domain. The goal is to provide AI bots with enough data to enable them to glean patterns of language that they can regurgitate, when asked a question, in a form that seems to be (but isn’t really) the output of an intelligent entity.

Authors, musicians and artists have filed numerous lawsuits asserting that this process infringes their copyrights, since in most cases they haven’t granted permission or been compensated for the use. One of the most recent such cases, filed last month in New York federal court by authors including Kai Bird — co-author of “American Prometheus,” which became the authorized source of the movie “Oppenheimer” — charges that Microsoft downloaded “approximately 200,000 pirated books” via Books3 to train its own AI bot, Megatron.

Like many of the other copyright cases, Bird and his fellow plaintiffs contend that the company could have trained Megatron using works in the public domain or obtained under licensing. “But either of those would have taken longer and cost more money than the option Microsoft chose,” the plaintiffs state: to train its bot “without permission and compensation as if the laws protecting copyrighted works did not exist.”

I asked Microsoft for a response, but haven’t received a reply.

Among judges who have pondered the issues, the tide seems to be building in favor of regarding the training process as fair use. Indeed, Alsup himself came to that conclusion in the Anthropic case, ruling that use of the downloaded material for AI training was fair use — but he also heard evidence that Anthropic had held on to the downloaded material for other purposes — specifically to build a research library of its own.

That’s not fair use, he found, exposing Anthropic to accusations of copyright piracy.

Alsup’s ruling was unusual, but also “Solomonic,” Lee told me. His finding of fair use delivered a “partial victory” for Anthropic, but his finding of possible piracy put Anthropic in “a very difficult spot,” Lee says.

That’s because the financial penalties for copyright infringement can be gargantuan, ranging from $750 per work to $150,000 — the latter if a jury finds that the user engaged in willful infringement. As many as 7 million works may have been downloaded by Anthropic, according to filings in the lawsuit, though an undetermined number of those works may have been duplicated in the two shadow libraries the firm used, and may also have been duplicated among copyrighted works the firm actually paid for.
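The statutory arithmetic behind these figures is simple enough to sketch. The per-work damages range ($750 to $150,000 for willful infringement) and the work counts come from the article; the helper function below is purely illustrative, not part of any filing.

```python
# Statutory copyright damages range cited in the article:
# $750 per work at the minimum, up to $150,000 per work for willful infringement.
MIN_PER_WORK = 750
WILLFUL_MAX_PER_WORK = 150_000

def statutory_exposure(num_works: int, per_work: int) -> int:
    """Total statutory damages for a given number of infringed works
    and per-work award."""
    return num_works * per_work

# The article's headline number: 7 million works at the willful maximum.
worst_case = statutory_exposure(7_000_000, WILLFUL_MAX_PER_WORK)
print(f"${worst_case:,}")  # $1,050,000,000,000 — i.e., $1.05 trillion

# Even at the statutory minimum, the exposure dwarfs Anthropic's
# estimated ~$3 billion annual revenue cited in the article.
floor_case = statutory_exposure(7_000_000, MIN_PER_WORK)
print(f"${floor_case:,}")  # $5,250,000,000
```

This is why, as Lee notes later in the piece, even persuading a jury to award only the $750 minimum would still leave a multibillion-dollar bill.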

The number of works won’t be known until at least Sept. 1, the deadline Alsup has given the plaintiffs to submit a list of all the allegedly infringed works downloaded from the shadow libraries. If subtracting the duplicates brings the total of individual infringed works to 7 million, a $150,000 bill per work would total $1.05 trillion. That would financially swamp Anthropic: The company’s annual revenue is estimated at about $3 billion, and its value on the private market is estimated at about $100 billion.

“In practical terms,” Lee wrote on his blog, “ChatGPT is eating the world,” class certification means “Anthropic faces at least the potential for business-ending liability.”

Anthropic didn’t reply to my request for comment on that prospect. In a motion asking Alsup to send his ruling to the 9th U.S. Circuit Court of Appeals or to reconsider his finding himself, however, the company pointed to the blow that his position would deliver to the AI industry. If his position were widely adopted, Anthropic stated, then “training by any company that downloaded works from third-party websites like LibGen or Books3 could constitute copyright infringement.”

That was an implicit admission that the use of shadow libraries is widespread in the AI camp, but also a suggestion that since it’s the shadow libraries that committed the alleged piracy, the AI firms that used them shouldn’t be punished.

Anthropic also noted in its motion that the plaintiffs in its case didn’t raise the piracy issue themselves — Alsup came up with it on his own, by treating the training of AI bots and the creation of a research library as two separate uses, the former allowed under fair use, the latter disallowed as an infringement.

That deprived Anthropic of an opportunity to respond to the theory in court. The firm observed that a fellow federal judge in Alsup’s San Francisco courthouse, Vince Chhabria, came to a contradictory conclusion only two days after Alsup, absolving Meta Platforms of a copyright infringement claim on similar facts, based on the fair use exemption.

Alsup’s class certification is likely to roil both the plaintiff and defendant camps in the ongoing controversy over AI development. Plaintiffs who haven’t made a piracy claim in their lawsuits may be prompted to add it. Defendants will come under greater pressure to forestall lawsuits by scurrying to reach licensing deals with writers, musicians and artists.

That will happen especially if another judge accepts Alsup’s argument about piracy. “That may well encourage other lawsuits,” Lee says.

For Anthropic, the challenge will be “trying to convince a jury that the award of damages should be $750 per work,” Lee says. Alsup’s ruling makes this case one of the rare lawsuits in which “the plaintiffs have the upper hand,” now that they have won class certification.

“All these companies will have great pressure to negotiate settlements with plaintiffs; otherwise, they’re at the mercy of the jury, and you can’t bank on anything in terms of what a jury might do.”
