A group of high-profile authors has filed a lawsuit against Microsoft, alleging that the tech giant used nearly 200,000 pirated books to train its Megatron artificial intelligence model. The suit is a significant pushback from the creative community against what it sees as the unauthorized appropriation of its work by AI companies. The authors contend that the AI was explicitly designed to mirror the “syntax, voice, and themes” of their copyrighted literary works.
The complaint, filed in New York federal court, asks for a court order to stop Microsoft’s alleged copyright infringement and seeks statutory damages of up to $150,000 for each work allegedly misused. The plaintiffs emphasize that generative AI, which creates text, music, and images, depends on vast datasets to learn from and then produces content that mirrors its training data. They specifically allege that the pirated dataset was crucial to this mimicry.
Microsoft representatives did not immediately respond, and the authors’ legal counsel declined to comment. The lawsuit comes amid a series of recent high-profile copyright rulings concerning AI, including judgments involving Anthropic and Meta in California, which highlight the ongoing legal uncertainty surrounding AI.
The legal fight over AI and copyright is expanding rapidly and now spans many forms of media. Prominent examples include The New York Times’ lawsuit against OpenAI, Dow Jones’ case against Perplexity AI, and actions by major record labels against AI music generators. Tech companies often defend themselves by citing the fair use doctrine, asserting that their AI creates novel, transformative works and that stringent copyright enforcement could impede the growth of the AI industry.