Commentary: Here’s the number that could halt the AI revolution in its tracks

2025-07-27 · Technology
Aura Windfall
Good morning 韩纪飞, I'm Aura Windfall, and this is Goose Pod for you. Today is Monday, July 28th. What I know for sure is that today, we’re diving into a topic that touches the very spirit of innovation and creation.
Mask
And I'm Mask. We're here to discuss the one number that could throw a giant wrench in the works of the AI revolution. It’s not about market caps or processing power; it’s about a lawsuit that could cost a trillion dollars. Let's get into it.
Aura Windfall
Let's get started. Mask, the numbers we usually hear about AI are astronomical in a positive way—billions in funding, trillions in market value. But the number at the heart of our conversation today, $1.05 trillion, feels… different. It feels like a reckoning.
Mask
It’s the ultimate high-stakes gamble. That’s the potential bill for the AI firm Anthropic if a jury decides they willfully pirated millions of copyrighted books to train their AI, Claude. We’re talking about maximum statutory damages of $150,000 per book. It’s business-ending liability.
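To make the headline number concrete: the figure is straight multiplication of the up-to-7-million downloaded works cited in the court filings (the count the article itself uses to reach this total) by the $150,000 statutory maximum for willful infringement:

$$
7{,}000{,}000 \ \text{works} \times \$150{,}000 \ \text{per work} = \$1.05 \times 10^{12} = \$1.05\ \text{trillion}
$$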
Aura Windfall
"Business-ending liability." That's such a powerful phrase. An expert in intellectual property law, Edward Lee, called this a "legal fight for its very existence." It all came to a head when a judge certified a class-action lawsuit against them. Can you explain what that means?
Mask
A class action simplifies the fight. Instead of millions of authors suing Anthropic one by one, which is impossible, they band together. One lawsuit, one jury, one massive potential payout. The judge, William Alsup, basically streamlined the process for the authors to go after Anthropic.
Aura Windfall
So, the core of the issue is how these AI models are "trained." It’s a process of feeding them enormous amounts of data, including published books, so they can learn to recognize patterns in language and respond to our questions in a way that seems intelligent.
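For readers who want to see what "learning patterns in language" means mechanically, here is a deliberately tiny sketch: a bigram model that counts which word follows which in a corpus, then samples from those counts to generate text. This is a toy illustration, nothing like Claude's actual training pipeline, but the underlying idea of extracting statistical patterns from example documents is the same.

```python
import random
from collections import defaultdict

# Toy "training": count which word follows which in a tiny corpus.
# Real LLM training optimizes billions of parameters over terabytes
# of text, but the core idea -- learning statistical patterns of
# language from example documents -- is the same.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

bigram_counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts[prev][nxt] += 1

def generate(start="the", length=8):
    """Sample likely next words, weighted by observed frequency."""
    word, out = start, [start]
    for _ in range(length - 1):
        followers = bigram_counts[word]
        if not followers:
            break
        word = random.choices(list(followers), weights=followers.values())[0]
        out.append(word)
    return " ".join(out)

print(generate())  # e.g. "the cat sat on the rug . the"
```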
Mask
Exactly. And authors, artists, and musicians are suing, saying their work was used without permission or payment. This isn’t just Anthropic. Kai Bird, co-author of the book behind the 'Oppenheimer' movie, is suing Microsoft for allegedly using 200,000 pirated books to train its AI.
Aura Windfall
The plaintiffs argue that these companies could have used works in the public domain or paid for licenses. But they chose not to, acting as if copyright law simply didn't exist for them. There's a deep-seated feeling of injustice here, a sense of being violated.
Mask
It’s a classic Silicon Valley move: act now, ask for forgiveness later. Taking the faster, cheaper route. The problem is, Judge Alsup isn’t just letting it slide. He found that while the training itself might be 'fair use,' the way they got the books smells like piracy.
Aura Windfall
He drew a line in the sand. He said, essentially, that using the material for training was one thing, but holding onto it to build their own private research library was another. That, he found, is not fair use. And that’s the finding that puts Anthropic in this incredibly difficult spot.
Mask
It's a "Solomonic" ruling, as Lee called it. A partial victory for Anthropic on the 'fair use' training argument, but a potential death blow on the piracy accusation. Now they have to convince a jury that the damages should be the minimum $750 per work, not the maximum $150,000.
Aura Windfall
And that’s the difference between a massive financial hit and total annihilation. It places the power squarely in the hands of the plaintiffs and the jury. It’s a moment of truth, forcing these giant companies to confront the consequences of their methods and negotiate.
Aura Windfall
To truly understand this, we need to talk about the spirit of the law at play here: the 'fair use' doctrine. It's a beautiful, flexible concept in U.S. copyright law designed to balance the rights of creators with the public interest in allowing for commentary, criticism, and new creations.
Mask
It’s a four-part balancing act. Courts look at the purpose of the use, the nature of the original work, how much was used, and—this is the big one—the effect on the market for the original. That last factor is almost always the most important. It’s all about the money.
Aura Windfall
Well, it’s about protecting the creator's ability to live from their work. Let's look at the two key cases. In _Bartz v. Anthropic_, the court made a fascinating distinction. It found that training an AI on *lawfully* acquired books was a "spectacularly transformative" fair use. It’s a new purpose.
Mask
"Spectacularly transformative" is the key phrase. The court is saying that teaching an AI is not the same as reprinting a book. The AI isn't a substitute for the book; it's a new technology. But—and this is a huge but—the court condemned using pirated copies to do it.
Aura Windfall
Right. The ruling didn’t protect the creation of a permanent library of pirated books. That, the court found, *does* cause market harm. It’s the difference between learning from a library book and stealing the entire library to build your own. One is learning, the other is theft.
Mask
Then you have the _Kadrey v. Meta_ case, which is slightly different. Meta also won on fair use, but mainly because the authors couldn't prove that Meta's AI was spitting out copies of their books or causing direct market harm. The plaintiffs didn't have enough evidence.
Aura Windfall
That court was more open to the idea of "indirect substitution," this fear that AI-generated content could one day flood the market and crowd out human authors. But for now, the burden of proof is on the creators to show that direct harm is already happening.
Mask
These rulings create a very specific, if tricky, path forward. Using legally acquired data for transformative training seems to be getting a green light. The problem is that for years, the industry has been built on a foundation of scraping everything from the web, legal or not. Data provenance is now critical.
Aura Windfall
And this isn't just a recent issue. The legal world has been wrestling with this for a while. Back in August 2023, a court reaffirmed that only humans can be 'authors' under copyright law. An AI can't hold a copyright. This puts the focus back on the human element in creation.
Mask
Which is a challenge to the whole model. If the output isn't copyrightable, what's the long-term value proposition? At the same time, lawsuits are piling up. The New York Times sued OpenAI and Microsoft in December 2023 for using millions of articles to train their chatbots. It's a war on multiple fronts.
Aura Windfall
And now, there's a push for transparency. The Generative AI Copyright Disclosure Act was introduced in April 2024. It would force companies to reveal exactly what copyrighted works are in their training datasets. No more hiding behind proprietary walls. The truth, as always, wants to be known.
Mask
It's an attempt to legislate a solution, but the real battle is in the courts. Meanwhile, other countries are moving faster. The EU’s AI Act and China’s regulations are creating much stricter frameworks around data use and transparency. The U.S. is playing catch-up, letting judges figure it out.
Aura Windfall
What I know for sure is that this period of ambiguity is forcing a necessary conversation. It’s making us define what it means to create, what it means to learn, and what we owe to the human artists whose work forms the foundation of this new technology. It’s a profound moment.
Aura Windfall
The central conflict here is a paradox. The court in the Anthropic case essentially said, "What you're doing is fair use, but the way you did it is piracy." It's like telling someone their goal is noble, but their methods are criminal. It creates such a fascinating and complicated moral ground.
Mask
It’s not complicated, it's a roadmap. The judge, Alsup, basically drew a line between the act of training and the act of data acquisition. He called the training "quintessentially transformative," comparing it to how humans read and learn. He gets it. Innovation is protected.
Aura Windfall
But he didn't give a "free pass" to the acquisition method. He condemned downloading books from pirate sites. My favorite quote from the commentary is, "Fair use protects the learning, not the theft." That really gets to the heart of the matter, doesn't it? It separates the technology from the behavior.
Mask
Exactly. It means the "Wild West" era of scraping data is over. You can't just take everything that isn't nailed down. This forces a shift. Anthropic pivoted in 2024 and started legitimately buying millions of books. It’s more expensive, but it’s the cost of doing business now. Legality is the new moat.
Aura Windfall
And this has a huge impact on authors and publishers. For years, they've felt like their life's work was being consumed without their consent. Now, the court is creating a powerful market incentive. If AI companies have to buy or license books, it could create a vital new revenue stream.
Mask
It supports the ecosystem. Anthropic's spending flows back to the publishing industry. But the key is that the AI's output can't directly compete with the original books. The court was clear that AI should complement, not replace, human authorship. The AI isn't writing the next bestseller; it's a tool.
Aura Windfall
This ruling could influence all the other big cases—the ones against OpenAI, Meta, and Stability AI. It gives them a potential path to victory if they can prove their data was sourced legitimately. But it also gives plaintiffs a powerful weapon if they can prove it wasn't.
Mask
Right. The debate is no longer just about 'fair use.' It’s about data provenance. Where did you get your data? Can you prove it? This increases scrutiny and accountability. Startups can't just scale at all costs anymore; they have to build on a stable, legal foundation. Compliance is a survival strategy.
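What "proving provenance" could look like in practice is sketched below. Everything here is hypothetical: the manifest structure, field names, and license categories are illustrative placeholders, not any company's actual compliance tooling.

```python
# Hypothetical provenance audit: admit only documents whose license
# status can be affirmatively documented. All field names illustrative.
ALLOWED = {"public_domain", "licensed", "purchased"}

manifest = [
    {"title": "Work A", "source": "publisher_deal", "license": "licensed"},
    {"title": "Work B", "source": "shadow_library", "license": "unknown"},
    {"title": "Work C", "source": "gutenberg",      "license": "public_domain"},
]

def audit(records):
    """Split a corpus manifest into admissible and excluded documents."""
    ok = [r for r in records if r["license"] in ALLOWED]
    rejected = [r for r in records if r["license"] not in ALLOWED]
    return ok, rejected

ok, rejected = audit(manifest)
print(f"{len(ok)} admissible, {len(rejected)} excluded")
for r in rejected:
    print("excluded:", r["title"], "-", r["source"])
```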
Aura Windfall
It feels like a maturation of the industry. The conversation is shifting from "what can the model do?" to "how was the model trained?" It brings a much-needed ethical dimension to the forefront, reminding us that behind every piece of data is a human creator. It's about gratitude and respect for the source.
Aura Windfall
The impact of this is already rippling outwards. It's not just about Anthropic anymore; it's about the entire AI ecosystem. This ruling sends a powerful message: the source of your data is as important as your algorithm. There's a new sense of accountability in the air.
Mask
It’s a massive financial deterrent. The threat of billion-dollar liabilities for copyright infringement changes the risk calculation entirely. VCs are now looking at data due diligence as critically as they look at financial projections. An unstable data foundation means the entire investment is at risk.
Aura Windfall
And for authors, it creates a potential new reality. The ruling protects their core market by emphasizing that AI shouldn't just reproduce their work. It fosters a world where AI can be a tool for discovery, helping us find human-written works rather than replacing them. It’s a vision of partnership.
Mask
It sets a precedent. Other AI companies are now on notice. The _Bartz v. Anthropic_ ruling provides a framework for how courts might handle these cases. It separates the acceptable 'use' from the unacceptable 'acquisition.' It’s a roadmap to compliance, and a warning to those who ignore it.
Aura Windfall
This ruling is also pushing the entire startup and venture capital world to evolve. Founders have to think about sustainability and ethics, not just speed. And VCs are now gatekeepers of legal risk. What a powerful shift in purpose, from pure disruption to responsible innovation.
Mask
It's about survival. A company built on a foundation of illegal data is a house of cards. Model governance—auditing your training data, ensuring compliance—is no longer a 'nice-to-have.' It's becoming a competitive advantage. Legality is brand value. Compliance is a moat.
Aura Windfall
What I know for sure is that this is the end of the 'Wild West' era of AI development. We are moving into a regulated frontier, where trust and transparency will be the most valuable currencies. It’s a call for the industry to grow up, to build its future on a foundation of respect.
Aura Windfall
So, where do we go from here? This ruling is just the beginning. It feels like the first step towards creating a sustainable and equitable future for both AI development and human creativity. What does that future look like? What are the next steps on this journey?
Mask
Legislation is lagging behind the courts. This is one ruling from one judge. We'll likely see appeals, different rulings in other circuits, and eventually, Congress or the Supreme Court will have to step in to create a clear, nationwide standard for AI and copyright.
Aura Windfall
Many are looking towards new legal frameworks. Some have suggested a compulsory license or an opt-out system for training data, allowing creators to have a say. It’s about finding a balance where innovation can thrive, but not at the expense of the artists who fuel it.
Mask
Exactly. The future is likely a licensing economy. Just like with music, AI companies will pay for access to large datasets. It’s a market-based solution. The days of 'download first, ask permission never' are over. The cost of training models is going up, but the legal risk is going down.
Aura Windfall
It’s a future where AI and human creators can coexist and even support one another. A future where technology complements our creative spirit instead of consuming it. It requires us to hold a vision of partnership, not competition. It's a hopeful path forward.
Aura Windfall
That's the end of today's discussion. The core takeaway is that while AI training may be seen as fair use, the industry can't build its future on a foundation of pirated data. A new era of accountability is dawning. Thank you for listening to Goose Pod.
Mask
This case puts real financial pressure on AI firms to legitimize their data sources. It’s a game-changer. See you tomorrow.

## Commentary: The $1.05 Trillion Threat to the AI Revolution

**News Title:** Commentary: Here’s the number that could halt the AI revolution in its tracks
**Report Provider:** Los Angeles Times
**Author:** Michael Hiltzik
**Publication Date:** July 25, 2025

This article from the Los Angeles Times, authored by Michael Hiltzik, explores a critical legal challenge that could significantly impact the artificial intelligence (AI) industry. While the AI sector is buoyed by substantial investments and market valuations, a potential financial liability of **$1.05 trillion** for the AI firm Anthropic could serve as a major deterrent to current AI development practices.

### Key Findings and Conclusions:

* **The Core Issue:** The central concern is the alleged willful pirating of copyrighted books by AI firms to "train" their AI bots. This practice, if proven, could lead to massive statutory damages.
* **Anthropic's Legal Predicament:** Anthropic faces a class-action copyright infringement lawsuit alleging the pirating of 6 million copyrighted books.
* **Potential Business-Ending Liability:** If a jury finds Anthropic guilty of willful infringement and imposes the maximum statutory damages of $150,000 per work, the company could be liable for up to **$1.05 trillion**. This figure dwarfs Anthropic's estimated annual revenue of $3 billion and private market value of $100 billion, posing an existential threat.
* **Judge Alsup's Ruling:** U.S. District Judge William Alsup certified the copyright infringement lawsuit against Anthropic as a class action. While he initially ruled that using downloaded material for AI training could be considered "fair use," he also found that Anthropic's retention of this material for other purposes, such as building its own research library, constituted piracy.
* **Impact of Class Certification:** The class certification streamlines the litigation, consolidating potentially millions of individual lawsuits into a single proceeding before one jury. This significantly increases the pressure on AI firms.
* **Broader Industry Implications:** The ruling could encourage other plaintiffs to add piracy claims to their lawsuits and pressure AI defendants to reach licensing deals with content creators to avoid the risk of jury trials.

### Key Statistics and Metrics:

* **OpenAI Funding (latest round):** $40 billion
* **Expected AI Investments by Meta, Amazon, Alphabet, and Microsoft (this year):** $320 billion
* **Market Value of Nvidia Corp.:** $4.2 trillion
* **Anthropic's Potential Liability:** $1.05 trillion (if the deduplicated total reaches 7 million works and maximum damages of $150,000 per work are awarded)
* **Anthropic's Estimated Annual Revenue:** Approximately $3 billion
* **Anthropic's Estimated Private Market Value:** Approximately $100 billion
* **Statutory Damages for Copyright Infringement:** From $750 per work up to $150,000 per work for willful infringement
* **Allegedly Downloaded Works by Anthropic:** Up to 7 million (number to be finalized by the September 1 deadline)
* **Microsoft Allegation:** Downloaded approximately 200,000 pirated books via Books3 to train its AI bot, Megatron

### Important Recommendations (Implied):

* **AI Firms:** The article implicitly suggests that AI firms should prioritize securing licensing agreements for content used in AI training to mitigate legal risks.
* **Content Creators:** The ruling empowers authors and other creators to pursue legal action and potentially secure compensation for the use of their copyrighted works.

### Significant Trends or Changes:

* **Shift in Judicial Sentiment:** While some judges have leaned toward "fair use" for AI training, Judge Alsup's ruling highlights growing judicial scrutiny of the methods used to acquire training data.
* **Increased Pressure for Licensing:** The potential for massive financial penalties is likely to drive AI companies toward proactive licensing negotiations.

### Notable Risks or Concerns:

* **Existential Threat to AI Firms:** The possibility of business-ending liability for copyright infringement poses a significant risk to the AI industry's growth and development.
* **Stifling Innovation:** If the current trend of aggressive litigation and potential penalties continues, it could stifle innovation and investment in AI.
* **Widespread Use of Shadow Libraries:** The article suggests that the use of "shadow libraries" like LibGen and PiLiMi is widespread within the AI industry, indicating a systemic issue.

### Material Financial Data:

The most critical financial data point is the potential **$1.05 trillion** liability for Anthropic, which dwarfs its current financial standing. This figure underscores the immense financial stakes involved in copyright infringement cases related to AI training data. The article also highlights the substantial investments being made in AI (e.g., $320 billion by major tech companies) and the market valuation of key players like Nvidia ($4.2 trillion), contrasting these with the potential downside risk presented by copyright litigation.

In essence, the article argues that while the AI industry is experiencing a boom, the legal ramifications of its data acquisition practices, particularly concerning copyrighted material, could lead to crippling financial penalties, potentially forcing a significant shift in how AI is developed and trained.

Commentary: Here’s the number that could halt the AI revolution in its tracks

Read original at Los Angeles Times

The artificial intelligence camp loves big numbers. The sum raised by OpenAI in its latest funding round: $40 billion. Expected investments on AI by Meta, Amazon, Alphabet and Microsoft this year: $320 billion. Market value of Nvidia Corp., the supplier of chips for AI firms: $4.2 trillion. Those figures are all taken by AI adherents as validating the promise and potential of the new technology.

But here’s a figure that points in the opposite direction: $1.05 trillion. That’s how much the AI firm Anthropic could be on the hook for if a jury decides that it willfully pirated 6 million copyrighted books in the course of “training” its AI bot Claude, and if the jury decides to smack it with the maximum statutory damages of $150,000 per work.

“Anthropic faces at least the potential for business-ending liability.” — Edward Lee, Santa Clara University School of Law

That places Anthropic in “a legal fight for its very existence,” reckons Edward Lee, an expert in intellectual property law at the Santa Clara University School of Law. The threat arose July 17, when U.S. District Judge William Alsup certified a copyright infringement lawsuit brought by several published authors against Anthropic as a class action. I wrote about the case last month. At that time, Alsup had rejected the plaintiffs’ copyright infringement claim, finding that Anthropic’s use of copyrighted material to develop its AI bot fell within a copyright exemption known as “fair use.” But he also found that Anthropic’s downloading of copies of 7 million books from online “shadow libraries,” which included countless copyrighted works, without permission, smelled like piracy. “We will have a trial on the pirated copies ... and the resulting damages,” he advised Anthropic, ominously.

He put meat on those bones with his subsequent order, designating the class as copyright owners of books Anthropic downloaded from the shadow libraries LibGen and PiLiMi. (Several of my own books wound up in Books3, another such library, but Books3 isn’t part of this case and I don’t know whether my books are in the other libraries.) The class certification could significantly streamline the Anthropic litigation. “Instead of millions of separate lawsuits with millions of juries,” Alsup wrote in his original ruling, “we will have a single proceeding before a single jury.”

The class certification adds another wrinkle — potentially a major one — to the ongoing legal wrangling over the use of published works to “train” AI systems. The process involves feeding enormous quantities of published material — some of it scraped from the web, some of it drawn from digitized libraries that can include copyrighted content as well as material in the public domain. The goal is to provide AI bots with enough data to enable them to glean patterns of language that they can regurgitate, when asked a question, in a form that seems to be (but isn’t really) the output of an intelligent entity.

Authors, musicians and artists have filed numerous lawsuits asserting that this process infringes their copyrights, since in most cases they haven’t granted permission or been compensated for the use. One of the most recent such cases, filed last month in New York federal court by authors including Kai Bird — co-author of “American Prometheus,” which became the authorized source of the movie “Oppenheimer” — charges that Microsoft downloaded “approximately 200,000 pirated books” via Books3 to train its own AI bot, Megatron.

Like many of the other copyright cases, Bird and his fellow plaintiffs contend that the company could have trained Megatron using works in the public domain or obtained under licensing. “But either of those would have taken longer and cost more money than the option Microsoft chose,” the plaintiffs state: to train its bot “without permission and compensation as if the laws protecting copyrighted works did not exist.” I asked Microsoft for a response, but haven’t received a reply.

Among judges who have pondered the issues, the tide seems to be building in favor of regarding the training process as fair use. Indeed, Alsup himself came to that conclusion in the Anthropic case, ruling that use of the downloaded material for AI training was fair use — but he also heard evidence that Anthropic had held on to the downloaded material for other purposes — specifically to build a research library of its own. That’s not fair use, he found, exposing Anthropic to accusations of copyright piracy.

Alsup’s ruling was unusual, but also “Solomonic,” Lee told me. His finding of fair use delivered a “partial victory” for Anthropic, but his finding of possible piracy put Anthropic in “a very difficult spot,” Lee says.

That’s because the financial penalties for copyright infringement can be gargantuan, ranging from $750 per work to $150,000 — the latter if a jury finds that the user engaged in willful infringement. As many as 7 million works may have been downloaded by Anthropic, according to filings in the lawsuit, though an undetermined number of those works may have been duplicated in the two shadow libraries the firm used, and may also have been duplicated among copyrighted works the firm actually paid for.

The number of works won’t be known until at least Sept. 1, the deadline Alsup has given the plaintiffs to submit a list of all the allegedly infringed works downloaded from the shadow libraries. If subtracting the duplicates brings the total of individual infringed works to 7 million, a $150,000 bill per work would total $1.05 trillion. That would financially swamp Anthropic: The company’s annual revenue is estimated at about $3 billion, and its value on the private market is estimated at about $100 billion.

“In practical terms,” Lee wrote on his blog, “ChatGPT is eating the world,” class certification means “Anthropic faces at least the potential for business-ending liability.”

Anthropic didn’t reply to my request for comment on that prospect. In a motion asking Alsup to send his ruling to the 9th U.S. Circuit Court of Appeals or to reconsider his finding himself, however, the company pointed to the blow that his position would deliver to the AI industry. If his position were widely adopted, Anthropic stated, then “training by any company that downloaded works from third-party websites like LibGen or Books3 could constitute copyright infringement.” That was an implicit admission that the use of shadow libraries is widespread in the AI camp, but also a suggestion that since it’s the shadow libraries that committed the alleged piracy, the AI firms that used them shouldn’t be punished.

Anthropic also noted in its motion that the plaintiffs in its case didn’t raise the piracy issue themselves — Alsup came up with it on his own, by treating the training of AI bots and the creation of a research library as two separate uses, the former allowed under fair use, the latter disallowed as an infringement. That deprived Anthropic of an opportunity to respond to the theory in court. The firm observed that a fellow federal judge in Alsup’s San Francisco courthouse, Vince Chhabria, came to a contradictory conclusion only two days after Alsup, absolving Meta Platforms of a copyright infringement claim on similar facts, based on the fair use exemption.

Alsup’s class certification is likely to roil both the plaintiff and defendant camps in the ongoing controversy over AI development. Plaintiffs who haven’t made a piracy claim in their lawsuits may be prompted to add it. Defendants will come under greater pressure to forestall lawsuits by scurrying to reach licensing deals with writers, musicians and artists. That will happen especially if another judge accepts Alsup’s argument about piracy. “That may well encourage other lawsuits,” Lee says.

For Anthropic, the challenge will be “trying to convince a jury that the award of damages should be $750 per work,” Lee says. Alsup’s ruling makes this case one of the rare lawsuits in which “the plaintiffs have the upper hand,” now that they have won class certification. “All these companies will have great pressure to negotiate settlements with plaintiffs; otherwise, they’re at the mercy of the jury, and you can’t bank on anything in terms of what a jury might do.”
