What is it about?

Sentiment analysis (SA) is a hot topic in data mining and natural language processing. Most SA studies address the English language, and the study of SA in Persian started in 2012. Besides supervised machine learning methods, which need annotated corpora for training, lexicon-based methods have shown promising results in SA. The main component of these methods is the lexicon used for scoring users' comments. We compared existing lexicons and created two new resources to address the scarcity of resources for SA in Persian: a carefully labeled lexicon of sentiment words, PerLex, and a new handmade dataset of about 16,000 rated documents, PerView.
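
To make the idea of lexicon-based scoring concrete, here is a minimal sketch in Python. It is not the method from the paper: the toy lexicon, its English entries and scores, and the function names are all assumptions chosen for illustration, and none of the entries come from PerLex. The sketch only shows the general pattern of looking up each word of a comment in a lexicon and summing the polarity scores.

```python
import re

# Toy lexicon for illustration only (an assumption);
# these entries and scores are NOT taken from PerLex.
TOY_LEXICON = {
    "good": 1.0,
    "great": 2.0,
    "bad": -1.0,
    "terrible": -2.0,
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def lexicon_score(text, lexicon):
    """Sum the polarity scores of the tokens found in the lexicon.

    Tokens that are missing from the lexicon contribute 0.
    """
    return sum(lexicon.get(token, 0.0) for token in tokenize(text))


def classify(text, lexicon):
    """Map the summed score to a coarse sentiment label."""
    score = lexicon_score(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    comment = "The screen is great but the battery is bad"
    print(lexicon_score(comment, TOY_LEXICON), classify(comment, TOY_LEXICON))
```

In a scheme like this, the quality and coverage of the lexicon largely determine the result, which is why the choice of lexicon matters so much. A real system for Persian would also need language-specific normalization and tokenization, which this sketch leaves out.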

Why is it important?

In addition to creating PerLex and PerView, we present a new hybrid method that combines machine learning (ML) with the lexicon-based approach, in which PerLex words are used to train the ML algorithm. Results indicate that the accuracy of PerLex is higher than that of the existing CNRC, Adjectives, SentiStrength, PerSent, and LexiPers lexicons. In addition, the results show that using PerLex significantly decreases the execution time of the proposed system compared with the above-mentioned lexicons.

Perspectives

Writing this article was a great pleasure as I believe it is a step forward in SA in the Persian language. I hope this article and the resources introduced in it help researchers in the field of SA.

Mohammad Ehsan Basiri
Shahrekord University

Read the Original

This page is a summary of: Words Are Important, ACM Transactions on Asian and Low-Resource Language Information Processing, December 2018, ACM (Association for Computing Machinery),
DOI: 10.1145/3195633.