Anthropic Hit With Copyright Suit From Authors Over Flagship AI

Aug. 20, 2024, 3:22 PM UTC

Anthropic PBC is facing another copyright lawsuit over the training of its generative AI models, this time from a group of authors who say the company used illegally downloaded versions of their works and tried to hide the extent of the infringement.

The authors bringing the proposed class action—Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson—said Amazon.com Inc.-backed Anthropic has publicly said it used an open-source dataset called The Pile to train its AI “Claude.” The Pile contained within it another dataset called “Books3,” known to contain nearly 200,000 pirated books, including some written by the named plaintiffs, according to the complaint filed in the US District Court for the Northern District of California.

While the Books3 dataset was removed from The Pile in August 2023, old versions containing the pirated books are still available, the complaint said. The authors said “it is apparent” from news articles and announcements that Anthropic trained its AI on content from that dataset, a “modern-day Napster,” instead of licensing the content.

“It is not consistent with core human values or the public benefit to download hundreds of thousands of books from a known illegal source,” the complaint said. “Anthropic has attempted to steal the fire of Prometheus.”

Anthropic’s models “compromise authors’ ability to make a living” by allowing users to generate text the writers would otherwise be paid to create and sell, the authors said. They argued the tech company’s actions have “usurped a licensing market for copyright owners,” referencing other AI companies, including OpenAI Inc., Google, and Meta Platforms Inc., that have struck up deals with content owners such as Axel Springer SE and News Corp.

“In the last two years, a thriving licensing market for copyrighted training data has developed,” the authors wrote.

Anthropic is also facing a lawsuit in the same court from eight music publishers, who say the company’s AI chatbot produced verbatim song lyrics scraped from the internet.

The complaint said that while Anthropic “styles itself as a public benefit company, designed to improve humanity,” it has “wrought mass destruction” for copyright owners.

Anthropic didn’t immediately respond to a request for comment.

The authors are represented by Susman Godfrey LLP, Lieff Cabraser Heimann & Bernstein LLP, and Cowan Debaets Abrahams & Sheppard LLP.

Counsel for Anthropic hasn’t yet entered an appearance in this case. Latham & Watkins LLP represents the company in the music publishers’ lawsuit.

The case is: Bartz et al v. Anthropic PBC, N.D. Cal., 3:24-cv-05417, complaint filed 8/19/24.

To contact the reporter on this story: Aruni Soni in Washington at asoni@bloombergindustry.com

To contact the editors responsible for this story: Adam M. Taylor at ataylor@bloombergindustry.com; James Arkin at jarkin@bloombergindustry.com
