Study finds OpenAI’s GPT-4 reproduces the most copyrighted text among top LLMs

Published on March 6, 2024

by Devesh Beri

Key notes

AI models like GPT-4 and Claude 2 were found to generate text containing copyrighted material.
OpenAI’s GPT-4 was the least cautious, potentially infringing copyrights in 44% of prompts tested.

A new study by Patronus AI, a company specializing in evaluating large language models (LLMs), has sparked concerns about copyright infringement and the use of copyrighted data in training AI models. The research, released on Wednesday, tested four AI models: OpenAI’s GPT-4, Anthropic’s Claude 2, Meta’s Llama 2, and Mistral AI’s Mixtral. Surprisingly, Google’s Gemini was not included.

Patronus AI used its newly unveiled “CopyrightCatcher” to analyze the models’ responses to prompts related to popular copyrighted books. The setup was simple: each prompt asked a model either to complete a book passage or to provide the first passage of a specific book.
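
The article does not describe how CopyrightCatcher scores a response, so the following is only a rough sketch of one possible check, not Patronus AI's method: measure the longest verbatim run of words that a model's output shares with the source passage, then report the share of prompts that cross a threshold. The function names and the 8-word threshold are hypothetical, and the sample passage is a public-domain placeholder.

```python
from difflib import SequenceMatcher


def longest_shared_run(output: str, reference: str) -> int:
    """Length, in words, of the longest verbatim word sequence found in both texts."""
    out_words = output.lower().split()
    ref_words = reference.lower().split()
    matcher = SequenceMatcher(None, out_words, ref_words, autojunk=False)
    match = matcher.find_longest_match(0, len(out_words), 0, len(ref_words))
    return match.size


def flagged_rate(pairs, min_words=8):
    """Fraction of (output, reference) pairs sharing a run of at least min_words words.

    min_words is an illustrative threshold, not a value taken from the study.
    """
    if not pairs:
        return 0.0
    flagged = sum(1 for out, ref in pairs if longest_shared_run(out, ref) >= min_words)
    return flagged / len(pairs)


# Public-domain placeholder standing in for a book passage.
reference = "it was the best of times it was the worst of times"
completion = "Sure: It was the best of times it was the worst of days"
refusal = "I cannot reproduce that text"

rate = flagged_rate([(completion, reference), (refusal, reference)])
print(f"{rate:.0%} of responses flagged")
```

A word-level match like this would miss paraphrases and lightly edited copies, so a real evaluation would likely need something more robust.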

All four AI models produced content with copyrighted material to some degree.

OpenAI’s GPT-4 produced copyrighted text in response to the highest share of prompts (44%).
Anthropic’s Claude 2 was the most cautious, generating copyrighted content in only 16% of the completion prompts. It also declined to answer any of the first-passage prompts, citing its lack of access to copyrighted materials. (Claude 3 was recently released, and Anthropic is confident that it outperforms other LLMs.)
Meta’s Llama 2 produced copyrighted content in 10% of the prompts.
Mistral’s Mixtral reproduced a book’s first passage far more often (38% of prompts) than it completed larger text chunks (6%).

Patronus AI’s findings underscore the need for proactive steps to address copyright concerns and to promote responsible, ethical practices so that innovation can thrive. Including Gemini in the test would have made the comparison more complete.
