The Governance Implications of an Important Case About AI and Fair Use

The recent federal court ruling in Bartz v. Anthropic PBC has significantly shifted the legal terrain for corporate governance and artificial intelligence. While the case directly addresses copyright issues, it has implications for boards of directors, compliance departments, and AI policy.

At the core of the Bartz case lies a deceptively simple question: Can training artificial intelligence systems like Claude on copyrighted books qualify as fair use? The court held that it can, but drew a sharp line between such training and the maintenance of a permanent digital repository of pirated books, a practice the court found was not protected by fair use. This boundary-setting will likely shape future litigation involving generative AI and copyright law, and it has material consequences for companies that develop or deploy LLMs trained on third-party content.

The court’s analysis adhered to the familiar four-factor test of 17 U.S.C. Section 107. According to the judge, the first factor, the purpose and character of the use, was satisfied, as training large language models (LLMs) on text was “exceedingly transformative.” However, the court rejected Anthropic’s contention that its accumulation of a static library of pirated books was similarly protected. This distinction between transformative use for AI training and non-transformative data warehousing is doctrinally significant. It highlights the evolving boundary between innovation and appropriation in the context of machine learning. Not all uses by AI companies are equally defensible; some may advance knowledge and expressive freedom, while others amount to little more than theft. For companies investing in generative AI, this signals that the details of data governance matter. Boards and executives will need to ask: How was our model trained? What controls exist to manage downstream copyright risk? Are our data sources properly licensed?

In a prior article, we examined the murky intersection between copyright and AI-generated outputs. Current U.S. copyright law recognizes only “original works of authorship” by human creators. This creates fundamental uncertainty when dealing with machine-generated text, which may not qualify for copyright protection, or worse, may infringe on protected material. The Bartz ruling highlights that this debate is no longer theoretical. It is happening in courtrooms, and its outcome will determine both compliance strategies and innovation trajectories.

Beyond the authorship debate, we engaged with the risk of copyright infringement posed by AI-generated text. We noted that while some AI-generated content is derived from public domain material, many of the texts ingested in training remain under copyright. The creation of these datasets reflects the collective labor of countless individuals, often incorporated without consent or compensation. Though not always amounting to literal plagiarism, LLMs may produce outputs that borrow or rephrase protected material, exposing developers to claims of infringement.

Our article underscored that U.S. law remains unsettled. While courts like the one in Bartz recognize some AI training as fair use, others have dismissed infringement claims for failing to show substantial similarity between outputs and original works. Still, litigation is ongoing, and the broader doctrinal picture is far from clear. Thus, the questions raised in Bartz echo well beyond that case: Does the ingestion of copyrighted material by LLMs violate copyright law? Is the resulting output protected, infringing, or something in between? And as AI systems become more adept at mimicking human style and structure, should we recalibrate our understanding of creativity, originality, and fair use?

Legal Risk and Ethical AI: Next Steps for Boards

So what does this mean for governance? As generative AI technologies become increasingly sophisticated and commercially viable, governance practices must evolve in tandem. First, with respect to risk management, boards should assess legal exposure tied to training datasets and licensing practices. Second, companies should create policies for ethical AI development, particularly around content ingestion. Third, publicly traded firms may need to disclose AI-related IP risks in securities filings. Finally, investors and audit committees should evaluate whether AI initiatives comply with evolving copyright norms.

We believe that the Bartz case is not the end of the story. The case offers a temporary equilibrium, a jurisprudential stopgap that invites both celebration and scrutiny. But the clock is ticking. As generative AI continues to evolve, so must our legal frameworks. The question is not just what the law is, but what it ought to become.

This post comes to us from Professor Hadar Y. Jabotinsky, the founder and head of the Hadar Jabotinsky Center for Interdisciplinary Research of Financial Markets, Crises and Technology, and Professor Michal Lavi, a senior researcher at the Hadar Jabotinsky Center for Interdisciplinary Research of Financial Markets, Crises and Technology.
