What is it about?

This paper studies whether large language models can predict when a customer is likely to repeat an action, such as buying groceries again. We test this in a simple but important setting: given a person's past purchase timing, can a large language model predict the number of days until the next purchase? We compare large language models with traditional statistical and machine learning methods, and we also test whether giving the model more information about the customer and product helps. The results show that large language models can pick up some useful signals, but they are still less accurate than specialized machine learning models for this type of numerical timing prediction. Interestingly, giving the model too much extra context can sometimes make predictions worse, because the model may focus on less relevant details instead of the core timing pattern.

Why is it important?

Many real-world systems need to predict not only what a person may want, but also when they may need it. For example, grocery replenishment, subscription renewal, reminders, and personalized recommendations all depend on understanding timing. This work is important because it gives a clearer view of what large language models can and cannot do in structured prediction tasks. While LLMs are powerful for language and reasoning, they may struggle with precise numerical patterns in behavioral data. The study shows that carefully selected context can help, but simply adding more information is not always better. These findings can guide future AI systems that combine the flexibility of language models with the precision of traditional machine learning.

Perspectives

For me, this work is interesting because it challenges a common assumption in the current LLM era: that adding more context will automatically make a model reason better. In practical recommendation systems, the key is not just to provide more information, but to provide the right information. This study also connects closely to real industry needs. In retail and personalization, timing matters a lot. A recommendation can be technically correct but still not useful if it arrives too early or too late. I hope this work encourages more discussion on how to design hybrid AI systems that are both context-aware and quantitatively reliable.

Yanan CAO

Read the Original

This page is a summary of: Is More Context Always Better? Examining LLM Reasoning Capability for Time Interval Prediction, April 2026, ACM (Association for Computing Machinery). DOI: 10.1145/3774904.3792900.
