
Meta faces lawsuit for training AI with pirated books

February 10, 2025

In a recent lawsuit, Meta has been accused of using pirated books to train its AI models with CEO Mark Zuckerberg’s approval. According to Ars Technica, the lawsuit, filed in a California federal court by authors including Ta-Nehisi Coates and Sarah Silverman, cites internal Meta communications indicating that the company used the Library Genesis (LibGen) dataset, a vast online repository known for hosting pirated books, despite internal concerns about the legality of using such material.

The authors argue that Meta’s actions infringe upon their copyrights and could undermine the company’s position with regulators. They claim that Meta’s AI models, including Llama, were trained using their works without permission, potentially harming their livelihoods. Meta has defended its practices by invoking the “fair use” doctrine, asserting that using publicly available materials to train AI tools is legal in certain cases, such as “using text to statistically model language and generate original expression.”

In a post dated February 8, 2025, vx-underground (@vxunderground) wrote:

Unsealed court documents from February 5th, 2024, in Kadrey v. Meta show Meta (formerly Facebook) illegally torrented 81.7TB of data from “shadow libraries” such as Anna’s Archive, Z-Library, and LibGen to train Meta artificial intelligence.

Highlights include:
– A senior AI…

One internal message highlighted in the lawsuit quotes an employee expressing discomfort, stating, “Torrenting from a corporate laptop doesn’t feel right.”

In response to the lawsuit, U.S. District Judge Vince Chhabria dismissed some claims but allowed the authors to amend their complaint to include new allegations, including those related to the removal of copyright management information. This case is part of a broader wave of legal challenges against tech companies like Meta, OpenAI, and Anthropic, where authors and creators are seeking to protect their intellectual property rights in the face of rapidly advancing AI technologies.

The outcome of this lawsuit could have significant implications for the tech industry, particularly concerning the use of copyrighted materials in AI training. It raises important questions about the balance between technological innovation and the protection of creators’ rights.
