Commentary: Here’s the number that could halt the AI revolution in its tracks

2025-07-28 · Technology
Aura Windfall
Good morning 徐国荣, I'm Aura Windfall, and this is Goose Pod for you. Today is Tuesday, July 29th.
Mask
And I'm Mask. Today we're diving into a number so big it could stop the entire AI revolution cold: $1.05 trillion. This isn't revenue; it's a potential fine.
Aura Windfall
Let's get started. It's truly a staggering figure. We're so used to hearing these astronomical numbers associated with AI, but they're usually positive, right? They speak to growth and potential, like the billions invested in companies like OpenAI or the trillions in market value for chipmakers like Nvidia.
Mask
Exactly. Those are vanity numbers. This one is different. This is a threat. $1.05 trillion is the maximum statutory damages the AI firm Anthropic could face if a jury finds they willfully pirated millions of copyrighted books to train their AI, Claude. This isn't just a bump in the road; it's a potential apocalypse for them.
Aura Windfall
And it brings up such a fundamental question about this new world we're building. To create these incredible AI tools, companies feed them enormous amounts of information—the process they call "training." But what happens when that information is copyrighted art, literature, and journalism?
Mask
That's the entire battleground. The AI needs to read everything to learn, but "everything" includes works that people own. Authors, artists, and musicians have filed a barrage of lawsuits, all asking the same basic question: Can you just take our work without asking or paying?
Aura Windfall
What I know for sure is that this feels like a moment of reckoning. It’s a collision between the speed of technology and the principles of ownership and creativity. This lawsuit against Anthropic, which was recently certified as a class action, puts that conflict in the starkest possible terms.
Mask
It streamlines the fight. Instead of millions of authors suing one by one, you get one massive case, one jury. And that jury will be looking at Anthropic's use of so-called "shadow libraries"—online collections of pirated books—to get their training data. It’s a high-stakes legal drama.
Aura Windfall
It's fascinating because the judge in this case, Judge William Alsup, initially seemed to side with the AI company on one point. He suggested that the simple act of using copyrighted material for training could be considered "fair use," a legal exemption for transformative work.
Mask
But—and this is a trillion-dollar 'but'—he also found that downloading 7 million books from pirate websites without permission smells a lot like piracy. So he’s essentially saying, the learning might be legal, but the way you got the textbooks was not. Now, Anthropic is in a fight for its very existence.
Aura Windfall
It paints such a vivid picture. On one hand, you have this technology that feels like magic, capable of writing and reasoning. On the other, the foundational ingredients for that magic might have been stolen. It really challenges us to think about the ethics of innovation.
Mask
Ethics, sure, but this is about cold, hard cash and survival. If Anthropic goes down or is crippled by this, every other AI company using similar methods is a target. That's why this number, $1.05 trillion, is the one that truly matters. It represents the potential cost of building the future on a shaky legal foundation.
Aura Windfall
It’s a powerful symbol of the tension at the heart of the AI revolution. We want progress, we want these amazing tools, but we also have to ask ourselves what the true cost is and who ultimately pays the price for it. It’s a conversation about value, and values.
Aura Windfall
To really understand this, we need to talk about the core legal concept here, which is "fair use." It’s this idea in U.S. copyright law that says you can use copyrighted material without permission under certain circumstances, like for criticism, news reporting, or education.
Mask
And AI companies have grabbed onto that idea for dear life. Their argument is: "We aren't plagiarizing books. We are teaching our machine. The AI reads millions of books to understand language, just like a human student. It's transformative, so it's fair use." It’s a bold, and frankly, necessary argument for them.
Aura Windfall
It is bold. And some judges are buying it. In the Anthropic case, and a similar one involving Meta, judges have said that the training process itself seems "spectacularly transformative." They see the AI model as something entirely new, not just a copy of the books it read.
Mask
But the devil is in the details, specifically where the books came from. Anthropic didn't just go to a library. They allegedly downloaded massive datasets from "shadow libraries" like LibGen and PiLiMi. These are pirate sites. They are the digital equivalent of a back-alley bookstore selling stolen goods.
Aura Windfall
And that’s the pivot point in the Anthropic case. Judge Alsup made a distinction that is, as one expert called it, "Solomonic." He essentially split the issue in two. He said the *use* of the material for training was fair use, a victory for Anthropic.
Mask
But the *acquisition* of that material through piracy, and keeping it to build a permanent research library, was not fair use. That was just infringement. So, you get this bizarre situation where the 'what' is legal, but the 'how' is illegal. It puts Anthropic in a terrible spot.
Aura Windfall
It's a crucial distinction. It’s like saying a chef can create a brilliant, transformative new dish, but they can’t use stolen ingredients to do it. The court is focusing on the provenance of the data, where it came from, and whether it was obtained legally.
Mask
This is a new front in the war. It's not just about what the AI produces; it's about its diet. Other lawsuits, like one against Microsoft, make similar claims, alleging they used a pirated dataset called Books3 to train their AI. The plaintiffs say Microsoft could have used public domain works, but piracy was faster and cheaper.
Aura Windfall
This really brings the business model into question. For a long time, the approach in tech has been to move fast and break things. Here, it seems they may have broken federal copyright law on a massive scale, and now the bill is coming due. What I find so interesting is how the courts are trying to apply old laws to this completely new frontier.
Mask
They’re struggling. You have different judges in the same courthouse coming to contradictory conclusions. One judge sides with Meta on fair use, while Judge Alsup creates this nuanced piracy angle against Anthropic. It’s a legal mess because the technology is moving faster than the law can ever hope to keep up.
Aura Windfall
This legal uncertainty itself becomes a huge risk factor. Investors and the companies themselves are operating in a gray area. Anthropic even argued that since using shadow libraries is widespread in the AI industry, they shouldn't be singled out for punishment. It's a bit of an "everyone was doing it" defense.
Mask
A defense that rarely works in court. The reality is that these AI models are built on a foundation of data. If that foundation is made of stolen materials, the entire structure is at risk of collapsing. This background of piracy and legal battles sets the stage for a major conflict over the future of AI itself.
Aura Windfall
And it creates a powerful incentive for change. One case, *Bartz v. Anthropic*, which was decided in June 2025, really highlights this. The court found that when Anthropic *later* went and legally purchased books to train its models, *that* was a clear case of fair use. It showed they could do it legally if they wanted to.
Mask
Which is a double-edged sword. It proves there’s a legal path forward, but it also proves they knew there was a choice and initially chose the illegal path, which a jury might see as willful infringement. That's how you get to the maximum penalty of $150,000 per work. Multiply that by millions of books, and you see the problem.
Aura Windfall
It's a fascinating and complex backdrop. We have the principle of fair use, the reality of pirated data, and a series of conflicting court rulings all trying to make sense of a technology that is, as one judge put it, "among the most transformative we will see in our lifetimes." The stage is set for a major showdown.
Aura Windfall
And this brings us to the heart of the conflict. On one side, you have the creators—authors, artists, journalists—who see their life's work being ingested by these massive systems without their permission and without compensation. To them, it feels deeply personal and unjust.
Mask
And on the other side, you have the builders, the innovators. They argue that you cannot build these monumental technologies by negotiating licenses for every blog post, every book, every image on the internet. The scale is too vast. They see it as learning, not theft. Progress requires data.
Aura Windfall
What I know for sure is that both sides have a truth. There's the truth of intellectual property and the right to control one's creation. And then there's the truth of innovation, which often involves building upon what came before in new and unexpected ways. The conflict is about which truth prevails.
Mask
This isn't about truth; it's about leverage. Judge Alsup's ruling handed the plaintiffs all the leverage. By condemning the piracy while allowing the principle of training, he created a perfect weapon. He basically told the AI companies: "Your goal is legal, but your methods were not."
Aura Windfall
It's a powerful distinction. He compared AI training to a human reading books to learn, saying it would be "unthinkable" to make someone pay every time they recall something they read. But he also said the tech's transformative nature "doesn't give you a free pass to transform other people's property into your training data through illegal means."
Mask
He’s trying to have it both ways. It's a "Solomon's choice," as one legal expert put it—splitting the baby. But in doing so, he's forcing a massive change in behavior. The era of the data free-for-all, the "Wild West" of scraping anything and everything online, is over. The sheriff has arrived, and he's a federal judge.
Aura Windfall
This creates an intense pressure on the AI defendants. They are now at the mercy of a jury, and juries are unpredictable. The risk of facing a judgment in the billions, or even a trillion dollars, is a powerful motivator to settle and to create licensing frameworks for the future.
Mask
It forces the industry to grow up. The "move fast and break things" mantra hits a brick wall called copyright law. Now, the conflict shifts from the courtroom to the boardroom. The new challenge is: how do we build a sustainable, legal, and ethical data supply chain for AI? That's a much harder problem to solve than just writing code.
Aura Windfall
And it's a conflict that could ultimately benefit everyone. While authors can't completely stop AI training, this ruling pushes the industry toward a model where they get paid. It creates a market, turning what was once seen as piracy into a potential new revenue stream for the entire creative ecosystem.
Aura Windfall
The impact of this is immediate and profound. For Anthropic, the prospect of a trillion-dollar fine is, as one expert stated, a "business-ending liability." It's an existential threat that changes their entire strategic calculus overnight. They have to fight, but they also have to consider settling.
Mask
And the shockwaves extend to every AI startup and every venture capitalist. Before this, the main question for an AI company was "How good is your model?" Now, an equally important question is "How clean is your data?" Data provenance—the origin of your training data—is suddenly a critical part of due diligence.
Aura Windfall
It's a complete shift in mindset. The ethos of "download first, ask permission never" has become incredibly risky. What was once standard practice is now a massive legal and financial liability. It's forcing a new kind of maturity on the industry, where legal compliance is no longer an afterthought.
Mask
Compliance is becoming a brand value. A strategic moat. If you can prove your AI was trained on ethically and legally sourced data, you have a competitive advantage. It builds trust. The impact is that the "Wild West" era is ending, and a more regulated, scrutinized frontier is emerging. It’s a transition from pure tech to socio-technical governance.
Aura Windfall
I think this is a powerful "aha moment" for the tech industry. It's a realization that you can't build the future in a vacuum. The legal and ethical foundations are just as important as the technological ones. If the foundation is unstable, everything built on top of it is at risk.
Mask
This will trigger a wave of change. We'll see more companies scrambling to sign licensing deals, like OpenAI did with Shutterstock or Google with Getty Images. It creates a new market for data. The impact isn't just punitive; it's generative. It's creating a new economy around the data that fuels AI.
Aura Windfall
And that could be a very positive outcome. It suggests a path where innovation doesn't have to come at the expense of creators. Instead, the growth of AI could directly support the authors, artists, and musicians whose work provides the richness and depth that these models learn from. It’s a more sustainable path.
Aura Windfall
So, what does the future hold? This ruling feels like the first big domino to fall. It's likely to encourage more lawsuits. Other authors and creators who haven't made a piracy claim yet will probably add it to their cases, putting even more pressure on AI companies.
Mask
The future is lawyers. And licensing deals. Companies will be forced to the negotiating table. The race is on to build clean datasets and secure the rights for training data. This will slow things down and make AI development more expensive, but it's the only way to de-risk the business model.
Aura Windfall
We might also see new laws. There's already a "Generative AI Copyright Disclosure Act" proposed in the U.S. Congress. The law is playing catch-up, trying to create clear rules for this new territory. It's a recognition that court cases alone might not be enough to create a stable framework.
Mask
It's a global puzzle. The U.S. is wrestling with fair use, the EU is implementing transparency rules and opt-out rights for creators, and China is imposing its own top-down regulations. There's no single answer, which creates a complex landscape for any company operating worldwide. The future is a patchwork of regulations.
Aura Windfall
But what I hope for, what I know is possible, is a future of sustainable equilibrium. A future where AI companies pay for the value they get from creative works, and in turn, the creative economy is strengthened. Where technology and humanity find a way to enrich each other.
Aura Windfall
That's the end of today's discussion. The Anthropic case has drawn a line in the sand, forcing a confrontation between the limitless appetite of AI and the fundamental rights of creators. Thank you for listening to Goose Pod.
Mask
How this battle resolves will shape the next decade of technology. The number isn't just $1.05 trillion; it's the price of building the future responsibly. See you tomorrow.

## Commentary: The $1.05 Trillion Threat to the AI Revolution

**News Title:** Commentary: Here’s the number that could halt the AI revolution in its tracks
**Report Provider:** Los Angeles Times
**Author:** Michael Hiltzik
**Publication Date:** July 25, 2025

This article from the Los Angeles Times, authored by Michael Hiltzik, explores a critical legal challenge that could significantly impact the artificial intelligence (AI) industry. While the AI sector is buoyed by substantial investments and market valuations, a potential financial liability of **$1.05 trillion** for the AI firm Anthropic could serve as a major deterrent to current AI development practices.

### Key Findings and Conclusions:

* **The Core Issue:** The central concern is the alleged willful pirating of copyrighted books by AI firms to "train" their AI bots. This practice, if proven, could lead to massive statutory damages.
* **Anthropic's Legal Predicament:** Anthropic faces a class-action copyright infringement lawsuit alleging the pirating of up to 7 million copyrighted books.
* **Potential Business-Ending Liability:** If a jury finds Anthropic guilty of willful infringement and imposes the maximum statutory damages of $150,000 per work, the company could be liable for up to **$1.05 trillion**. This figure is vastly larger than Anthropic's estimated annual revenue of $3 billion and private market value of $100 billion, posing an existential threat.
* **Judge Alsup's Ruling:** U.S. District Judge William Alsup certified the copyright infringement lawsuit against Anthropic as a class action. While he initially ruled that using downloaded material for AI training could be considered "fair use," he also found that Anthropic's retention of this material for other purposes, such as building its own research library, constituted piracy.
* **Impact of Class Certification:** The class certification streamlines the litigation, consolidating potentially millions of individual lawsuits into a single proceeding before one jury. This significantly increases the pressure on AI firms.
* **Broader Industry Implications:** The ruling could encourage other plaintiffs to add piracy claims to their lawsuits and pressure AI defendants to reach licensing deals with content creators to avoid the risk of jury trials.

### Key Statistics and Metrics:

* **OpenAI Funding:** $40 billion
* **Expected AI Investments by Meta, Amazon, Alphabet, and Microsoft (this year):** $320 billion
* **Market Value of Nvidia Corp.:** $4.2 trillion
* **Anthropic's Potential Liability:** $1.05 trillion (if 7 million books are found infringed and maximum damages of $150,000 per work are awarded)
* **Anthropic's Estimated Annual Revenue:** Approximately $3 billion
* **Anthropic's Estimated Private Market Value:** Approximately $100 billion
* **Statutory Damages for Copyright Infringement:** Ranges from $750 per work to $150,000 per work for willful infringement
* **Allegedly Downloaded Works by Anthropic:** Up to 7 million (number to be finalized by the September 1 deadline)
* **Microsoft Allegation:** Downloaded approximately 200,000 pirated books via Books3 to train its AI bot, Megatron

### Important Recommendations (Implied):

* **AI Firms:** The article implicitly suggests that AI firms should prioritize securing licensing agreements for content used in AI training to mitigate legal risks.
* **Content Creators:** The ruling empowers authors and other creators to pursue legal action and potentially secure compensation for the use of their copyrighted works.

### Significant Trends or Changes:

* **Shift in Judicial Sentiment:** While some judges have leaned toward "fair use" for AI training, Judge Alsup's ruling highlights growing judicial scrutiny of the methods used to acquire training data.
* **Increased Pressure for Licensing:** The potential for massive financial penalties is likely to drive AI companies toward proactive licensing negotiations.

### Notable Risks or Concerns:

* **Existential Threat to AI Firms:** The possibility of business-ending liability for copyright infringement poses a significant risk to the AI industry's growth and development.
* **Stifling Innovation:** If the current trend of aggressive litigation and potential penalties continues, it could stifle innovation and investment in AI.
* **Widespread Use of Shadow Libraries:** The article suggests that the use of "shadow libraries" like LibGen and PiLiMi is widespread within the AI industry, indicating a systemic issue.

### Material Financial Data:

The most critical financial data point is the potential **$1.05 trillion** liability for Anthropic, which dwarfs its current financial standing. This figure underscores the immense financial stakes involved in copyright infringement cases related to AI training data. The article also highlights the substantial investments being made in AI (e.g., $320 billion by major tech companies) and the market valuation of key players like Nvidia ($4.2 trillion), contrasting these with the potential downside risk presented by copyright litigation.

In essence, the article argues that while the AI industry is experiencing a boom, the legal ramifications of its data acquisition practices, particularly concerning copyrighted material, could lead to crippling financial penalties, potentially forcing a significant shift in how AI is developed and trained.
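A quick sanity check on the scale of the claim, using the figures above (these ratios are our own arithmetic, not numbers stated in the article):

$$\frac{\$1.05\ \text{trillion}}{\$3\ \text{billion / year}} = 350\ \text{years of current revenue}, \qquad \frac{\$1.05\ \text{trillion}}{\$100\ \text{billion}} = 10.5\times\ \text{Anthropic's private-market value}$$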

Read original at Los Angeles Times

The artificial intelligence camp loves big numbers. The sum raised by OpenAI in its latest funding round: $40 billion. Expected investments on AI by Meta, Amazon, Alphabet and Microsoft this year: $320 billion. Market value of Nvidia Corp., the supplier of chips for AI firms: $4.2 trillion. Those figures are all taken by AI adherents as validating the promise and potential of the new technology.

But here’s a figure that points in the opposite direction: $1.05 trillion. That’s how much the AI firm Anthropic could be on the hook for if a jury decides that it willfully pirated 7 million copyrighted books in the course of “training” its AI bot Claude, and if the jury decides to smack it with the maximum statutory damages of $150,000 per work.

“Anthropic faces at least the potential for business-ending liability.” — Edward Lee, Santa Clara University School of Law

That places Anthropic in “a legal fight for its very existence,” reckons Edward Lee, an expert in intellectual property law at the Santa Clara University School of Law. The threat arose July 17, when U.S. District Judge William Alsup certified a copyright infringement lawsuit brought by several published authors against Anthropic as a class action. I wrote about the case last month. At that time, Alsup had rejected the plaintiffs’ copyright infringement claim, finding that Anthropic’s use of copyrighted material to develop its AI bot fell within a copyright exemption known as “fair use.”

But he also found that Anthropic’s downloading of copies of 7 million books from online “shadow libraries,” which included countless copyrighted works, without permission, smelled like piracy. “We will have a trial on the pirated copies ... and the resulting damages,” he advised Anthropic, ominously.

He put meat on those bones with his subsequent order, designating the class as copyright owners of books Anthropic downloaded from the shadow libraries LibGen and PiLiMi. (Several of my own books wound up in Books3, another such library, but Books3 isn’t part of this case and I don’t know whether my books are in the other libraries.)

The class certification could significantly streamline the Anthropic litigation. “Instead of millions of separate lawsuits with millions of juries,” Alsup wrote in his original ruling, “we will have a single proceeding before a single jury.”

The class certification adds another wrinkle — potentially a major one — to the ongoing legal wrangling over the use of published works to “train” AI systems.

The process involves feeding enormous quantities of published material — some of it scraped from the web, some of it drawn from digitized libraries that can include copyrighted content as well as material in the public domain. The goal is to provide AI bots with enough data to enable them to glean patterns of language that they can regurgitate, when asked a question, in a form that seems to be (but isn’t really) the output of an intelligent entity.

Authors, musicians and artists have filed numerous lawsuits asserting that this process infringes their copyrights, since in most cases they haven’t granted permission or been compensated for the use. One of the most recent such cases, filed last month in New York federal court by authors including Kai Bird — co-author of “American Prometheus,” which became the authorized source of the movie “Oppenheimer” — charges that Microsoft downloaded “approximately 200,000 pirated books” via Books3 to train its own AI bot, Megatron.

Like many of the other copyright cases, Bird and his fellow plaintiffs contend that the company could have trained Megatron using works in the public domain or obtained under licensing. “But either of those would have taken longer and cost more money than the option Microsoft chose,” the plaintiffs state: to train its bot “without permission and compensation as if the laws protecting copyrighted works did not exist.”

I asked Microsoft for a response, but haven’t received a reply.

Among judges who have pondered the issues, the tide seems to be building in favor of regarding the training process as fair use. Indeed, Alsup himself came to that conclusion in the Anthropic case, ruling that use of the downloaded material for AI training was fair use — but he also heard evidence that Anthropic had held on to the downloaded material for other purposes — specifically to build a research library of its own.

That’s not fair use, he found, exposing Anthropic to accusations of copyright piracy.

Alsup’s ruling was unusual, but also “Solomonic,” Lee told me. His finding of fair use delivered a “partial victory” for Anthropic, but his finding of possible piracy put Anthropic in “a very difficult spot,” Lee says.

That’s because the financial penalties for copyright infringement can be gargantuan, ranging from $750 per work to $150,000 — the latter if a jury finds that the user engaged in willful infringement. As many as 7 million works may have been downloaded by Anthropic, according to filings in the lawsuit, though an undetermined number of those works may have been duplicated in the two shadow libraries the firm used, and may also have been duplicated among copyrighted works the firm actually paid for.

The number of works won’t be known until at least Sept. 1, the deadline Alsup has given the plaintiffs to submit a list of all the allegedly infringed works downloaded from the shadow libraries. If subtracting the duplicates brings the total of individual infringed works to 7 million, a $150,000 bill per work would total $1.05 trillion. That would financially swamp Anthropic: The company’s annual revenue is estimated at about $3 billion, and its value on the private market is estimated at about $100 billion. “In practical terms,” Lee wrote on his blog, “ChatGPT is eating the world,” class certification means “Anthropic faces at least the potential for business-ending liability.”
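For readers who want to check the headline figure, here is the arithmetic implied by the statutory range and the upper-bound count of 7 million works cited in the filings (the $5.25 billion floor figure is our own extrapolation, not a number from the article):

$$7{,}000{,}000 \times \$150{,}000 = \$1.05\ \text{trillion} \quad \text{(willful-infringement ceiling)}$$

$$7{,}000{,}000 \times \$750 = \$5.25\ \text{billion} \quad \text{(statutory floor)}$$

The gap between floor and ceiling is why, as Lee notes below, Anthropic’s best realistic courtroom outcome is persuading a jury to award damages at the $750 minimum.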

Anthropic didn’t reply to my request for comment on that prospect. In a motion asking Alsup to send his ruling to the 9th U.S. Circuit Court of Appeals or to reconsider his finding himself, however, the company pointed to the blow that his position would deliver to the AI industry. If his position were widely adopted, Anthropic stated, then “training by any company that downloaded works from third-party websites like LibGen or Books3 could constitute copyright infringement.”

That was an implicit admission that the use of shadow libraries is widespread in the AI camp, but also a suggestion that since it’s the shadow libraries that committed the alleged piracy, the AI firms that used them shouldn’t be punished.

Anthropic also noted in its motion that the plaintiffs in its case didn’t raise the piracy issue themselves — Alsup came up with it on his own, by treating the training of AI bots and the creation of a research library as two separate uses, the former allowed under fair use, the latter disallowed as an infringement.

That deprived Anthropic of an opportunity to respond to the theory in court. The firm observed that a fellow federal judge in Alsup’s San Francisco courthouse, Vince Chhabria, came to a contradictory conclusion only two days after Alsup, absolving Meta Platforms of a copyright infringement claim on similar facts, based on the fair use exemption.

Alsup’s class certification is likely to roil both the plaintiff and defendant camps in the ongoing controversy over AI development. Plaintiffs who haven’t made a piracy claim in their lawsuits may be prompted to add it. Defendants will come under greater pressure to forestall lawsuits by scurrying to reach licensing deals with writers, musicians and artists.

That will happen especially if another judge accepts Alsup’s argument about piracy. “That may well encourage other lawsuits,” Lee says.

For Anthropic, the challenge will be “trying to convince a jury that the award of damages should be $750 per work,” Lee says. Alsup’s ruling makes this case one of the rare lawsuits in which “the plaintiffs have the upper hand,” now that they have won class certification.

“All these companies will have great pressure to negotiate settlements with plaintiffs; otherwise, they’re at the mercy of the jury, and you can’t bank on anything in terms of what a jury might do.”
