What is it about?

Large language models (LLMs) are now used across many domains, but their weaknesses remain a serious concern: they hallucinate, show bias, struggle with long contexts, create security and privacy risks, and often fail on reasoning tasks. This survey studies how research on these limitations has changed since 2022.

Instead of reviewing a manually selected set of studies in the traditional survey style, we take a corpus-scale, data-driven approach. From roughly 250,000 ACL Anthology and arXiv papers published between 2022 and early 2025, we identify 14,648 papers that substantially discuss LLM limitations and track how attention to these limitations shifts over time.

We find that both LLM research and research on LLM limitations have grown rapidly. Between 2022 and 2025, the share of LLM-related papers increases more than fivefold in the ACL Anthology and nearly eightfold in arXiv. Limitations-focused research grows even faster, reaching over 30% of LLM papers by 2025. Reasoning remains the most studied limitation, while arXiv shows increasing attention to security risks, alignment limitations, hallucinations, knowledge editing, and multimodality.

Why is it important?

This study examines LLM limitations at corpus scale rather than through a small, manually selected set of papers. It introduces a validated method for large-scale analysis of research trends and shows how attention to different limitations changes over time, revealing patterns that traditional surveys can miss. This is important because LLM research is growing rapidly, and concerns about LLM failures now matter in both research and deployment. The study helps researchers and practitioners identify emerging concerns and understand how work on LLM limitations is changing.

Perspectives

Like many researchers in the field, my coauthors, the colleagues who helped shape the initial idea, and I were struck by the sheer volume of research on LLMs and their limitations, and we wanted a clear overview that could be updated over time. We designed each part of the methodology carefully to build a rigorous pipeline for tracing how research on LLM limitations evolves. We hope this work will help others make sense of a fast-growing literature and clarify both current and future developments.

Aida Kostikova
Universität Bielefeld

Read the Original

This page is a summary of: LLLMs: A Data-Driven Survey of Evolving Research on Limitations of Large Language Models, ACM Computing Surveys, March 2026, ACM (Association for Computing Machinery),
DOI: 10.1145/3801096.
You can read the full text:
