OpenAI and Microsoft face another legal dispute: the New York Times has sued them for using its articles to train the ChatGPT chatbot without an agreement covering the use of its intellectual property. The lawsuit was filed in a federal district court in Manhattan and follows similar suits, including one brought by several prominent authors, among them George R. R. Martin.
However, this is the first time that a major news publisher has taken the developers of ChatGPT to court for copyright infringement. The issue is mainly economic: the NYT believes that OpenAI has generated significant profits from its content and that those profits correspond to financial damage for the newspaper, for which it seeks adequate compensation.
The NYT claims that OpenAI and Microsoft took advantage, for free and without any licensing agreement, of the huge investment the newspaper has made in its journalism. In one part of the complaint, the NYT states that its domain (www.nytimes.com) was the most heavily used proprietary source of content for training GPT-3.
Specifically, more than 66 million records were used, spanning almost a century of copyrighted publications. According to the NYT, OpenAI and Microsoft’s products can “generate output that recites NYT texts word-for-word, summarizes them accurately, and mimics their expressive style.”
OpenAI responded promptly through a spokesperson, in a statement to Engadget that we quote in part.
“We respect the rights of content creators and owners, and we are committed to working with them to ensure they benefit from AI technology and new revenue models. We are surprised and disappointed by this development.”
OpenAI is hopeful that a mutually beneficial solution can be found. If the lawsuit goes ahead, however, it could open the door to other similar lawsuits, which would increase the cost of training AI models for commercial purposes. Other news organizations, such as CNN and BBC News, have reportedly already tried to limit the data that AI web crawlers can extract for training and development purposes, though it is not clear with what results.
In earlier cases, OpenAI has agreed to pay for the use of protected content, as it did with the publisher Axel Springer, but with the NYT the situation is less clear. It is not known whether the newspaper is open to a licensing agreement, and above all it remains to be seen how much compensation it will seek. Meanwhile, OpenAI has entered into a three-year agreement to use Politico and Business Insider articles for training purposes.