The AI that perfects ChatGPT responses

It is now widely accepted that AI systems can serve up howlers to their users. Between the glue-on-pizza advice of AI Overview (Google), the embarrassing responses of Prometheus (Microsoft) and the false information that ChatGPT sometimes produces, these systems are far from perfect. Even though such hallucinations are becoming less common, OpenAI has decided to tackle the problem head-on by developing an AI, CriticGPT, to catch ChatGPT’s mistakes. A snake biting its own tail?

CriticGPT: a watchful eye on the code

This new system is based on GPT-4, the same language model that powers ChatGPT, but is specialized in detecting flaws in the chatbot’s responses. It meticulously analyzes the lines of code and then reports potential errors, thus lightening the workload of human reviewers.

This advance is part of a broader effort to better align AI systems with human expectations, including through reinforcement learning from human feedback. A recent study, titled LLM Critics Help Catch LLM Bugs, reveals that CriticGPT was trained on a dataset seeded with intentional errors, thereby refining its ability to identify and report a wide range of programming bugs.

The results are telling: in 63% of cases involving naturally occurring errors in language model output, human evaluators preferred the critiques written by CriticGPT over those generated by other AIs or even by human experts alone. A human-machine collaboration that seems to work wonderfully.

A wise expert, but still imperfect

CriticGPT goes even further. In further experiments, the model was given a sample of ChatGPT training data previously judged flawless by human experts. Against all expectations, CriticGPT detected anomalies in almost a quarter of cases, anomalies that the reviewers subsequently confirmed. Its skills therefore extend beyond code: CriticGPT can even identify subtle errors that would escape the eye of a human expert.

In their quest for excellence, the researchers have designed a new technique called Force Sampling Beam Search (FSBS). This method makes it possible to precisely adjust how rigorously CriticGPT hunts for imperfections, while controlling the frequency of false positives. It is an algorithm that prefers to explore less likely avenues when generating an answer rather than leaning toward the most obvious choice.
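The trade-off between rigor and false positives can be illustrated with a toy selection step. This is a minimal sketch and not OpenAI's implementation: it assumes several candidate critiques have already been sampled, and that each one carries a hypothetical quality score and a count of the issues it flags. The names `Critique`, `quality_score`, `num_flags` and `thoroughness` are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Critique:
    text: str
    quality_score: float  # hypothetical score, e.g. from a reward model
    num_flags: int        # number of issues this critique points out


def select_critique(candidates, thoroughness):
    """Pick the candidate maximizing quality_score + thoroughness * num_flags.

    A higher `thoroughness` favors critiques that flag more issues,
    at the risk of accepting more false positives.
    """
    return max(
        candidates,
        key=lambda c: c.quality_score + thoroughness * c.num_flags,
    )


# Two made-up candidate critiques of the same ChatGPT answer.
candidates = [
    Critique("Flags one clear bug.", quality_score=0.90, num_flags=1),
    Critique("Flags the bug plus two doubtful issues.", quality_score=0.70, num_flags=3),
]

cautious = select_critique(candidates, thoroughness=0.0)
exhaustive = select_critique(candidates, thoroughness=0.5)
print(cautious.text)
print(exhaustive.text)
```

With the weight at zero the short, high-confidence critique wins; raising it tips the choice toward the longer critique that reports more potential problems, some of which may be spurious.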

Despite the remarkable advances it offers, CriticGPT is not free of limitations inherent in its design. Its training focused mainly on analyzing short responses generated by ChatGPT, which could prove insufficient for tasks of greater scope and complexity. Furthermore, although CriticGPT significantly reduces errors, it has not yet managed to eliminate them completely. As a result, human reviewers may still make errors of judgment when they rely on its sometimes erroneous output. Next step: creating a new language model to hunt down CriticGPT’s errors after its corrections to ChatGPT’s responses? Who knows!

CriticGPT is a new AI system designed to track down errors in code written by ChatGPT.
It can analyze and report errors in the chatbot’s responses that have escaped human review.
Even though it proves effective, it remains imperfect and limited.
