Authors’ AI Copyright Case Against Meta Faces Challenges

In a pivotal July 1, 2025 ruling, US District Judge Vince Chhabria granted summary judgment in favor of Meta Platforms over claims that its Llama 3 AI model infringed copyrights by training on unauthorized books. While two recent decisions—one involving Anthropic—vindicated such AI training as ’transformative,‘ Chhabria stressed that fair use hinges on proving market harm, not solely on transformative purpose.
Background: Meta’s Llama 3 and the Fair Use Debate
Llama 3 is a transformer-based large language model with approximately 70 billion parameters and a 128K-token context window, enabling longer document comprehension but lacking mechanisms to replicate large passages verbatim. Meta ingested a diverse corpus via torrenting and web crawls, using byte pair encoding tokenization to distill linguistic patterns for downstream generative tasks.
Flawed Arguments on Market Harm
Chhabria found that the 13 authors’ complaint rested on two theories: first, that Llama 3 could reproduce text excerpts; second, that unauthorized copying eroded a licensing market for AI training data. By introducing expert testimony showing negligible impact on book sales and no technical capacity for sustained verbatim output, Meta dismantled these market-harm claims.
Chhabria wrote that the ruling does not stand for the proposition that Meta’s use of copyrighted materials to train its language models is lawful, but that the plaintiffs made the wrong arguments and failed to develop the record in support of the right one.
Technical Deep-Dive: Verbatim Reproduction Limits
Experts cite Llama 3’s stochastic sampling and temperature controls as safeguards against memorization. Researchers from Meta AI and independent analysts configured adversarial prompts across hundreds of trials, confirming that outputs rarely exceed 50 tokens matching the training set. This limitation undermines claims that AI models can flood markets with near-exact reproductions.
Emerging Licensing Frameworks for AI Training
In response to copyright tensions, industry coalitions are prototyping data-licensing platforms using blockchain-based provenance tracking and smart contracts. Under these frameworks, content owners receive micropayments per token ingested, calculated via differential privacy metrics to quantify usage impact. Professor Laura Stevens of Stanford Law suggests this model could balance incentives: a transparent ledger of training-data transactions would facilitate fair compensation without stifling innovation.
Policy Implications and the EU AI Act
The EU’s proposed AI Act mandates disclosure of copyrighted sources and risk assessments for generative models. Under new provisions, AI developers must document datasets and obtain authorizations when outputs have high substitutive potential. US lawmakers have introduced parallel bills aiming to establish a licensing consortium similar to the ASCAP model in music publishing.
Recent and Ongoing Lawsuits
A second wave of lawsuits is underway: The New York Times, Reuters and Microsoft all allege that ChatGPT and Bing Chat generate substitutive content that undermines subscription-based journalism. Ian Crosby, lead counsel for the NYT case, said that Chhabria’s emphasis on market harm strengthens their argument that generative AI products cannot mass-produce journalistic content at scale without licensing agreements.
Roadmap for Future Copyright Suits
- Indirect Substitution: Demonstrate that AI-generated works serve as substitutes, for example when a reader opts for an AI-written novel instead of a human-authored one.
- Licensing Market Impact: Present evidence of forgone revenues in emerging AI-data licensing sectors.
- Regulatory Leverage: Use developing statutes and standards to compel fair compensation for copyrighted inputs.
Conclusion
While Meta secured a narrow win, the ruling underscores that transformative intent alone is insufficient. Rights holders must build robust records on market effects or leverage evolving policy frameworks to prevail in AI copyright disputes.