Encyclopedia Britannica and Merriam-Webster Sue OpenAI Over AI Training Data

Encyclopedia Britannica and its dictionary subsidiary, Merriam-Webster, have filed a lawsuit against OpenAI. The suit accuses the AI company of large-scale copyright infringement, alleging that it used the publishers’ proprietary reference materials to train its language models without permission. The case highlights growing tensions between traditional content creators and technology companies over the unauthorized use of intellectual property in AI development.

Contents

Allegations of Massive Copyright Infringement
The Financial Impact and Loss of Web Traffic
Trademark Violations and Artificial Intelligence Hallucinations
Previous Legal Actions and Industry Context
The Evolution of the Britannica Group

The complaint was filed on Friday, March 13, 2026, in the United States District Court for the Southern District of New York, in Manhattan. According to the filing, Britannica and Merriam-Webster claim that their digital content was unlawfully used as training data for OpenAI’s flagship chatbot, ChatGPT, and the GPT large language models that underlie it.

Allegations of Massive Copyright Infringement

At the center of the lawsuit is the allegation that OpenAI copied nearly 100,000 online articles, encyclopedia entries, and dictionary definitions. The publishers assert that these materials, which include both free and subscription-based reference content, were scraped from their platforms and fed into OpenAI’s systems without authorization or a licensing agreement.

The lawsuit claims that, as a result of this training, ChatGPT produces narrative responses containing verbatim or near-verbatim reproductions of the copyrighted works. The complaint states that the AI models have memorized large portions of Britannica’s content, so that when users ask the chatbot for information, it often generates summaries, abridgments, or exact copies of the publishers’ definitions and historical entries.

The complaint acknowledges that the full extent of the copying is not publicly known and is currently known only to OpenAI. The publishers also accuse the company of misusing their articles in ChatGPT’s retrieval-augmented generation workflow, in which the model retrieves current content from the web and external databases at query time and uses it to answer user questions.

The Financial Impact and Loss of Web Traffic

For content creators like Britannica and Merriam-Webster, web traffic is a crucial component of the business model. The lawsuit argues that OpenAI’s actions cause them material harm by competing directly with their official websites.

By providing users with exact or near-exact copies of encyclopedia and dictionary entries, ChatGPT allegedly eliminates the need to visit the original sources. The publishers argue that this keeps users inside the ChatGPT interface who would otherwise have visited the Britannica or Merriam-Webster platforms, diverting valuable traffic and threatening their digital revenue streams.

Trademark Violations and Artificial Intelligence Hallucinations

Beyond copyright infringement under the Copyright Act of 1976, the lawsuit rests on a second legal pillar: trademark violations under the Lanham Act. The plaintiffs argue that OpenAI is not only taking their content but also damaging their long-established reputations.

According to the complaint, ChatGPT sometimes generates false or fabricated information, a phenomenon widely known as AI hallucination. The lawsuit claims that the chatbot occasionally produces such made-up facts and then falsely attributes them to Encyclopedia Britannica or Merriam-Webster. The publishers argue that attaching their brand names to inaccurate or invented information violates trademark law and undermines the public trust they have spent centuries building.

Previous Legal Actions and Industry Context

OpenAI, which was co-founded by CEO Sam Altman, has not yet commented publicly on the lawsuit. Across the broader artificial intelligence industry, however, tech firms have consistently defended their practices. Companies developing large language models generally argue that training their systems on publicly available internet data falls under the legal doctrine of fair use, on the grounds that the training transforms the original materials into something new.

This is not the first time the Britannica Group has taken legal action against an artificial intelligence developer. The current lawsuit closely mirrors one the publishers filed six months earlier. In September 2025, Britannica and Merriam-Webster sued the AI-powered search engine startup Perplexity in a New York federal court. That lawsuit rested on nearly identical grounds, accusing Perplexity of systematically scraping their websites, unlawfully copying articles, diverting internet traffic, and linking the publishers’ names to fabricated information.

The Evolution of the Britannica Group

The Britannica Group, headquartered in Chicago, has a long history of adapting to technological shifts. The company has been producing the Encyclopedia Britannica for more than 250 years. Recognizing the decline of physical reference books, it ended the print edition in 2012.

Since then, the publisher has shifted its business model toward digital expansion, educational software, and even the sale of AI agent software. Merriam-Webster has followed a similar path, establishing itself as a leading digital provider of dictionary services. Despite their embrace of digital platforms, the companies maintain that their intellectual property must be protected from unauthorized exploitation by generative AI companies.
