What is it about?

ChatGPT and other AI systems are often evaluated as if they were humans - for example, by seeing how well they perform on standardized tests such as the SAT. An issue with this approach is that AI systems are not humans, so the tests that are most illuminating about them may differ from the tests that are most illuminating about people. We introduce a different perspective for understanding these systems: reasoning about the pressures that have shaped them - above all, the task they are trained on, namely predicting the next word. This perspective reveals some surprising limitations of AI systems on seemingly simple tasks, such as counting the words in a list or decoding simple ciphers.

Why is it important?

Large language models (LLMs) are being adopted in an increasingly diverse range of fields, such as education, law, and cognitive science. It is therefore increasingly important to understand the strengths and limitations of these systems so that we can recognize when they can be trusted. Our results have practical implications for when language models can safely be used, and the perspective that we introduce provides a broadly useful way of reasoning about AI.

Read the Original

This page is a summary of: Embers of autoregression show how large language models are shaped by the problem they are trained to solve, Proceedings of the National Academy of Sciences, October 2024.
DOI: 10.1073/pnas.2322420121.
You can read the full text at https://doi.org/10.1073/pnas.2322420121.
