OpenAI finds human reviewers aided by GPT-4-based CriticGPT outperform unassisted counterparts

OpenAI is open for criticism

Published on June 28, 2024

by Rafly Gilang

Key notes

OpenAI’s new CriticGPT, based on GPT-4, critiques ChatGPT’s code outputs to assist human trainers.
Trainers assisted by CriticGPT outperform unassisted reviewers 60% of the time.
CriticGPT’s critiques are preferred over ChatGPT’s 63% of the time, thanks to fewer nitpicks and hallucinations.

Not long after releasing the ChatGPT desktop app on macOS, OpenAI has introduced yet another model. Called CriticGPT and based on GPT-4, it identifies and critiques errors in the popular AI chatbot’s code outputs to help human trainers during feedback.

The Microsoft-backed company explains that human trainers assisted by CriticGPT outperformed their unassisted counterparts 60% of the time. Even so, despite producing fewer hallucinated issues, CriticGPT still needs some criticism of its own, especially when handling complex tasks and dispersed errors.

An AI sure does know how to automate itself, but human reviewers are still needed. That’s why even Google still explicitly says that it uses human reviewers to review how AI is used in the browsing history section of Chrome.

So, similar to how ChatGPT is trained, CriticGPT also learns through human feedback, focusing on spotting errors deliberately inserted into code generated by ChatGPT. AI trainers then evaluated CriticGPT’s ability to find these intentional errors and naturally occurring bugs caught by other trainers.

The results showed that CriticGPT’s critiques were preferred over ChatGPT’s in 63% of cases for naturally occurring bugs, as it generated fewer unhelpful nitpicks and hallucinations.
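For readers curious how a figure like "preferred in 63% of cases" is typically derived, here is a minimal sketch in Python. It assumes each comparison records which of two critiques a trainer preferred. The function name, the labels, and the sample data are all hypothetical and made up for illustration; this is not OpenAI’s evaluation code or data.

```python
from collections import Counter


def preference_rate(judgments, candidate="criticgpt"):
    """Return the fraction of pairwise comparisons won by `candidate`.

    `judgments` is a list of labels, one per comparison, naming the
    critique source the trainer preferred (hypothetical format).
    """
    if not judgments:
        raise ValueError("need at least one judgment")
    counts = Counter(judgments)
    return counts[candidate] / len(judgments)


# Made-up sample: 100 comparisons, 63 of them won by the CriticGPT critique.
judgments = ["criticgpt"] * 63 + ["chatgpt"] * 37
rate = preference_rate(judgments)
print(f"CriticGPT preferred in {rate:.0%} of comparisons")
```

A rate above 50% means the candidate wins more head-to-head comparisons than it loses. It does not say by how much each individual critique was better.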

“In our research on CriticGPT, we found that applying RLHF to GPT-4 has promise to help humans produce better RLHF data for GPT-4. We are planning to scale this work further and put it into practice,” OpenAI promises.

Rafly Gilang

Tech Reporter

Rafly is a reporter with years of journalistic experience covering technology, business, social issues, and culture. Rafly currently reports on Microsoft-related products, tech, and AI for MSPowerUser. Got a tip? Send it to [email protected]
