What is it about?

Neural network models have achieved good results on similarity calculation for sentences and short texts. However, existing similarity algorithms perform poorly on long texts and fail to extract the richer semantic information hidden in the structure of long documents. This article aims to build a learning model that expresses the semantics of long texts more accurately and removes the bottleneck in long-text similarity calculation.


Why is it important?

This article integrates the grammatical-structure characteristics of Chinese long texts into the BERT model, proposing a semantic progressive fusion model that builds representations from word → sentence → text. This design preserves the true semantics of long texts as far as possible and improves the accuracy of long-text similarity calculation.

Perspectives

In recent years, my students and I have been researching Chinese text similarity. Because of the significant differences between Chinese and English, many models that work well on English texts are much less effective when transplanted to Chinese. This article is a small breakthrough in our research, and we hope it can offer some inspiration to researchers in this area.

Xiao Li
Anyang Normal University

Read the Original

This page is a summary of: Chinese long text similarity calculation of semantic progressive fusion based on Bert, Journal of Computational Methods in Sciences and Engineering, August 2024, IOS Press, DOI: 10.3233/jcm-247245.
