What is it about?

We discover a surprising degree of systematic disagreement between true-positive and false-positive metrics in recommender systems evaluation, a phenomenon occasionally noted but not explained by previous authors. We find an explanation for the discrepancy in the effect of popularity biases, which impact false-positive and true-positive metrics in very different ways: whereas true-positive metrics reward the recommendation of popular items, false-positive metrics penalize it.
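
To make the mechanism concrete, here is a minimal, hypothetical Python sketch (not the paper's code, and with made-up toy data): it scores a "most popular" recommender and a long-tail recommender with a true-positive metric (precision) and a false-positive metric (fallout, the fraction of a user's known disliked items that end up in the top of the recommendation list). Because popular items attract both many positive and many negative ratings in the test data, the popular recommendations can win on precision yet lose on fallout.

```python
# Hypothetical toy example (not the paper's code or data): how popularity bias in
# test feedback can make a true-positive metric and a false-positive metric disagree.

# Held-out test feedback per user: items rated positively / negatively.
test_positives = {
    "u1": {"popular_1", "niche_1"},
    "u2": {"popular_1", "popular_2"},
    "u3": {"popular_2", "niche_2"},
}
test_negatives = {
    "u1": {"popular_2"},
    "u2": {"niche_3"},
    "u3": {"popular_1"},
}

def precision_at_k(recommended, positives, k):
    """True-positive metric: fraction of the top-k that are known positives."""
    return sum(item in positives for item in recommended[:k]) / k

def fallout_at_k(recommended, negatives, k):
    """False-positive metric: fraction of the user's known negatives that appear
    in the top-k (lower is better)."""
    return len(set(recommended[:k]) & negatives) / len(negatives) if negatives else 0.0

def evaluate(recommendations, k=2):
    users = list(recommendations)
    prec = sum(precision_at_k(recommendations[u], test_positives[u], k) for u in users) / len(users)
    fall = sum(fallout_at_k(recommendations[u], test_negatives[u], k) for u in users) / len(users)
    return prec, fall

# A "most popular" recommender pushes the same heavily rated items to everyone;
# those items also carry most of the negative test ratings.
popularity_recs = {u: ["popular_1", "popular_2"] for u in test_positives}
# A long-tail recommender suggests items that rarely appear in anyone's test feedback.
longtail_recs = {
    "u1": ["niche_2", "niche_4"],
    "u2": ["niche_1", "niche_4"],
    "u3": ["niche_2", "niche_4"],
}

for name, recs in [("popularity", popularity_recs), ("long tail", longtail_recs)]:
    prec, fall = evaluate(recs, k=2)
    print(f"{name:10s}  precision@2 = {prec:.2f}  fallout@2 = {fall:.2f}")
```

On this made-up data the true-positive metric favours the popularity recommender, while the false-positive metric favours the long-tail one, which is the kind of disagreement the paper studies.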

Why is it important?

Psychological studies have reported that bad experiences usually outweigh good ones in a person's overall assessment. Recommender systems evaluation has mostly focused on measuring good experiences, hence on counting true positives: the idea is to recommend as many good items as possible. However, if bad experiences are known to have a stronger impact, the recommendation task can also be framed as avoiding bad items as much as possible. One might think that true-positive and false-positive metrics are complementary, but we found that in recommender systems evaluation this is not always the case. Seen this way, metrics that measure bad experiences can provide a broader perspective on evaluation. In our study, we found that the main reason for the disagreement is a strong popularity bias in the data used to train and test the algorithms. Moreover, we determine under which circumstances true-positive or false-positive metrics should be used for offline evaluation.

Perspectives

This work is my first simultaneous collaboration with researchers from several countries. It was a challenging but very rewarding experience, giving me the chance to share ideas and learn from such talented people.

Elisa Mena-Maldonado
RMIT University

Read the Original

This page is a summary of: Agreement and Disagreement between True and False-Positive Metrics in Recommender Systems Evaluation, July 2020, ACM (Association for Computing Machinery),
DOI: 10.1145/3397271.3401096.