Cleary Gottlieb Discusses Roadblocks for Plaintiffs in Generative Artificial Intelligence Lawsuit

On October 30, 2023, U.S. District Judge William Orrick of the Northern District of California issued an Order[1] largely dismissing without prejudice the claims brought by artists Sarah Andersen, Kelly McKernan and Karla Ortiz in a proposed class action lawsuit against artificial intelligence (“AI”) companies Stability AI, Inc., Stability AI Ltd. (together, “Stability AI”), DeviantArt, Inc. (“DeviantArt”) and Midjourney, Inc. (“Midjourney”). Andersen is the first of many cases brought by high-profile artists, programmers and authors (including John Grisham, Sarah Silverman and Michael Chabon) seeking to challenge the legality of using copyrighted material for training AI models.

Background

Stability AI, DeviantArt and Midjourney all provide tools and services that use AI to generate new images based on text prompts offered by consumers. The Complaint alleges that the Large-Scale Artificial Intelligence Network (“LAION”), a German non-profit which makes open-sourced datasets, scraped over five billion images from the internet (including works by the plaintiffs) at the behest of Stability AI.[2] The Complaint further alleges that (i) Stability AI used this LAION dataset for the purpose of training its Stable Diffusion software product, (ii) DeviantArt and Midjourney incorporated Stable Diffusion into their products and (iii) Midjourney’s AI product was also “trained on a subset of the images used to train Stable Diffusion” (suggesting that Midjourney conducted its own training using the LAION dataset).[3] Claims brought against the defendants included direct copyright infringement, vicarious copyright infringement, violations of the Digital Millennium Copyright Act, violations of California’s right of publicity statute and common law rights of publicity and unfair competition under California state law. All three defendants filed separate motions to dismiss, and DeviantArt filed a separate anti-SLAPP motion to strike the right of publicity claims, in which both Midjourney and Stability AI joined.

In the first instance, Judge Orrick dismissed Ortiz’s and McKernan’s copyright claims with prejudice because neither of them had fulfilled the requisite registration requirements to bring suit by registering their images with the Copyright Office, and further narrowed Andersen’s copyright claims to only those images in Andersen’s collection which had been registered with the Copyright Office at the time the suit was filed.[4] The court also dismissed with prejudice plaintiffs’ claims for unfair competition, finding those claims are preempted by the U.S. Copyright Act to the extent they are based on alleged copyright infringement. Judge Orrick granted leave to amend as to all other claims, but warned “I will not be as generous with leave to amend on the next, expected rounds of motions to dismiss and I will expect a greater level of specificity as to each claim alleged and the conduct of each defendant in support of each claim.” Below are some key takeaways from the Order.

Key Takeaways

Plaintiffs must prove actual unauthorized reproduction; mere usage of or reliance on an already trained model does not suffice for direct copyright infringement based on AI training. The direct infringement claims against DeviantArt and Midjourney were dismissed for failure to allege facts showing that DeviantArt and Midjourney had, themselves, reproduced copyrighted images in training their models; all that plaintiffs alleged was that DeviantArt and Midjourney were using Stable Diffusion to provide their own services.[5] The only direct infringement claim to survive the defendants’ motions to dismiss is that brought against Stability AI based on its alleged copying and use of images from the LAION dataset to train Stable Diffusion. Although Stability AI opposed plaintiffs’ assertions, Judge Orrick found that fact issues as to how Stable Diffusion was trained, and whether copies were made, could not be resolved on a motion to dismiss.[6]
AI outputs are not “derivative works” unless they are substantially similar to third-party copyrighted content. Judge Orrick rejected plaintiffs’ argument that every output image from these generative AI tools was necessarily a derivative work of the input data, given the implausibility that all training data is actually copyrighted or that all output actually relied on copyrighted training data.[7] Even so limited, Judge Orrick noted that a plaintiff must still show substantial similarity between the training data and a specific challenged output.[8]
Skepticism that the AI model itself could be a “derivative work.” Judge Orrick expressed confusion regarding plaintiffs’ arguments that Stable Diffusion itself was a derivative work because it allegedly stored “compressed copies” of the copyrighted images it was trained on, and instructed plaintiffs to (i) clarify their theory as to how Stable Diffusion operates with respect to the training images, (ii) define “compressed copies” and (iii) explain plausible facts in support.[9] In their motions to dismiss, defendants argued that their models—which are comprised of data and algorithms rather than five billion “compressed copies” of works—could not plausibly be described as substantially similar in protected expression to any alleged copyrighted work on which they were trained, as necessary to assert a claim for violation of the derivative works right.
General claims regarding vicarious infringement or violation of DMCA and right of publicity do not suffice; plaintiffs must allege plausible facts to support each claim. The court dismissed all claims for vicarious copyright infringement and violation of the DMCA and right of publicity with leave to amend. The vicarious infringement claims against DeviantArt and Midjourney were dismissed for failure to adequately plead any act of direct infringement.[10] As to Stability AI, Judge Orrick reiterated that the claim required further clarity as to how Stable Diffusion supposedly stored “compressed copies” and made them available to alleged direct infringers (other platforms or users). The court rejected any vicarious liability premised on the theory that all output is necessarily infringing, noting plaintiffs’ own allegations that no output is likely to be “a close match for any specific image in the training data.”[11] Similarly, Judge Orrick made clear that plaintiffs must allege plausible facts regarding what type of copyright management information was allegedly altered or removed from their works in violation of the DMCA, and which defendants did such removing or altering. And Judge Orrick rejected the adequacy of the right of publicity claims, noting that the Complaint failed to allege “any facts specific to the three named plaintiffs to plausibly allege that any defendant has used a named plaintiff’s name to advertise, sell, or solicit purchase of DreamStudio, DreamUp or the Midjourney product.”
Asserting facts plausibly establishing that copyrighted works are included in training data is sufficient to survive a motion to dismiss. Judge Orrick evaluated defendants’ argument that Andersen’s claims must fail because she had not identified which of her registered works were used as “training images” for Stable Diffusion, and concluded that it is sufficient, at the pleading stage, for Andersen to plead ownership of valid copyright registrations and to rely on the output of a search of her name on “haveibeentrained.com”[12] to support the plausibility and reasonableness of her belief that her copyrighted works were in the LAION dataset used for training Stable Diffusion.[13] Whether she will ultimately be able to prove this fact, and whether such training or the outputs resulting therefrom will be found infringing or fair use, remain to be resolved later in the case.

ENDNOTES

[1] Order on Motions to Dismiss and Strike (the “Order”), Andersen v. Stability AI Ltd. et al, 3:23-cv-00201 (N.D. Cal. Oct. 30, 2023).

[2] Complaint at 57, 101, 104, Andersen, 3:23-cv-00201.

[3] Id. at *34-35, 57, 62, 101, 104, 115, 134, 148-49.

[4] Order at *5.

[5] Id. at *8-14.

[6] Id. at *7.

[7] Id. at *10-12.

[8] Id. at *11-12.

[9] Id. at *9.

[10] Id.

[11] Id. at *15-16.

[12] HaveIbeentrained.com is a website created by a group of artists who call themselves “Spawning,” which allows individuals to check whether their art or photos have been used as part of the training data for text-to-image AI tools.

[13] Id. at *6-7.

This post comes to us from Cleary Gottlieb Steen & Hamilton LLP. It is based on the firm’s memorandum, “SIGNIFICANT ROADBLOCKS FOR PLAINTIFFS IN GENERATIVE ARTIFICIAL INTELLIGENCE LAWSUIT: California Judge Dismisses Most Claims Against AI Developers in Andersen v. Stability AI,” dated November 2, 2023, and available here. Cleary Gottlieb represents Midjourney in this matter.

The CLS Blue Sky Blog

Columbia Law School's Blog on Corporations and the Capital Markets
