A New York federal judge recently dismissed a copyright infringement lawsuit filed by Raw Story and AlterNet against OpenAI. The plaintiffs argued that OpenAI used their news articles without permission to train its models, harming their businesses. However, the court found that the plaintiffs did not demonstrate specific economic harm, such as a reduction in revenue or subscriber numbers, that could be directly linked to OpenAI's actions, and this lack of clear, measurable harm proved decisive in the dismissal. The court underscored that while copyright law protects against unauthorized copying and distribution, it does not currently extend to training on publicly available data unless plaintiffs can show a direct economic impact.

U.S. District Judge Colleen McMahon said that the outlets could not show enough harm to support the lawsuit but allowed them to file a new complaint, even though she said she was “skeptical” that they could “allege a cognizable injury.”

Also at issue was the definition of "copying." The plaintiffs could not demonstrate that copies of the material in question were stored directly in the AI systems, as opposed to merely being scanned and analyzed during training.

Authors Guild’s Class-Action Suit Reflects Similar Legal Hurdles

Similarly, a class-action lawsuit filed by the Authors Guild on behalf of a group of writers including Sarah Silverman and George R. R. Martin faced dismissal. The writers alleged that OpenAI used their books as training material without permission, infringing on their rights and affecting their livelihoods. The court, however, determined that the plaintiffs could not proceed without either evidence of concrete harm or evidence of actual infringement, such as the models reproducing passages verbatim. This decision underlines a recurring theme in recent cases: courts are requiring proof of tangible harm that goes beyond the mere use of publicly available content, together with proof of actual infringement.

Artists’ Copyright Claims Against AI Image Generators Face Dismissal

Generative AI companies in the imaging field, such as Stability AI and Midjourney, are also facing lawsuits from artists who contend that their copyrighted work was used as training data without consent. The plaintiffs in these cases argued that the AI models incorporated their art to produce new images, which they see as a form of infringement. However, in October 2023, a federal judge dismissed most of these claims, again citing the challenge of demonstrating harm. The court's decision highlighted that copyright law does not explicitly address the use of publicly accessible content for AI training, creating a hurdle for plaintiffs in similar cases.

The Question of “Training Rights” in Copyright Law

Across these cases, courts have consistently noted that copyright law, as currently written, does not directly address the use of publicly available content for training purposes. Historically, copyright law has protected against unauthorized reproduction, distribution, and derivative works, but it does not specifically regulate how publicly available information may be used for machine learning and AI training. Without a clearly defined right governing how content can be used in AI training, courts have pointed to the difficulty of pursuing these claims under existing law.

Implications for Music and Production Industries

The recent rulings also highlight implications for the music industry, where similar concerns have been raised. The Recording Industry Association of America (RIAA) and other music groups have argued that training AI models on copyrighted music could lead to reduced revenue streams for artists. For instance, if AI-generated music were to compete with human-created tracks on streaming platforms, artists might see a decline in royalty earnings. However, as seen in the recent cases, proving actual harm from AI training practices remains a legal hurdle.

The production music and sync licensing industry faces similar challenges. If companies begin using AI models trained on extensive music catalogs to generate production-ready music at a low cost, it could disrupt traditional licensing models. However, as with other cases, the absence of a legal precedent granting creators control over how publicly accessible content is used in AI training complicates their ability to enforce restrictions.

Moving Forward: Potential Legal Reforms

These recent dismissals illustrate how current copyright law may need adjustments to address the unique complexities of AI and machine learning. Courts have signaled that, while creators have valid concerns, legal protections specific to the use of training data are not yet well-defined. Without a legal framework that acknowledges "training rights," plaintiffs face an uphill battle in substantiating their claims in cases involving AI. New legislation addressing the use of copyrighted material in AI training could provide clearer protections for content creators while setting boundaries for AI companies.

For now, these rulings indicate that plaintiffs must demonstrate measurable, specific harm in copyright cases involving AI. Until clearer legal guidelines emerge, AI companies are likely to continue training their models on publicly available data within the boundaries of current copyright law, while creators may increasingly look to lawmakers for updated protections that address this evolving area of intellectual property rights.

Gene Turnbow

President of Krypton Media Group, Inc., radio personality and station manager of SCIFI.radio. Part writer, part animator, part musician, part illustrator, part programmer, part entrepreneur - all geek.