What is it about?

As many as 20% of ChatGPT responses can be incorrect. In this study, we first surveyed software industry professionals to understand how they use ChatGPT and how they deal with its reliability issues. We then developed a tool, the ChatGPT Incorrectness Detector (CID), inspired by the survey findings and by criminal psychology. In an evaluation on software reviews, CID detected incorrect responses with 75% accuracy.

Why is it important?

This is the first approach that can detect incorrect ChatGPT responses through the regular chat or API interfaces. As the use of ChatGPT grows, any regular user or researcher can apply the technique to improve ChatGPT's trustworthiness.

Perspectives

Researchers can extend this work to improve ChatGPT or other large language models (such as Google Bard). Users will also be able to use a similar technique.

Minaoar Hossain Tanzil

Read the Original

This page is a summary of: ChatGPT Incorrectness Detection in Software Reviews, April 2024, ACM (Association for Computing Machinery),
DOI: 10.1145/3597503.3639194.
