Zuckerberg Approved AI Training on Pirated Books, Filings Say

Jan. 9, 2025, 10:57 PM UTC

Meta Platforms Inc. CEO Mark Zuckerberg approved the tech giant’s use of a pirated book dataset to train its AI model LLaMA, a group of authors suing the company for copyright infringement alleged in unredacted court filings.

The factual discovery process during the litigation has also revealed that a Meta employee admitted to removing copyright information from the LibGen dataset, a controversial “shadow library” containing millions of pirated books under copyright, in an effort to “conceal widespread copyright infringement,” the unredacted filings said.

The filings are part of a motion by the proposed class of authors, who include comedian ...
