Record number
8149
Title
Sentiment Analysis of Persian informal texts using embedded informal words and attention-based LSTM network
Academic year
98-97
Supervisor
Dr. Behrouz Minaei
Advisor
Dr. Sajjadi
Abstract
Abstract— The massive volume of comments on websites and social networks has made it possible to learn about people's beliefs and preferences regarding goods and services on a large scale. To this end, sentiment analysis, which refers to determining the sentiment of texts, has been proposed as an intelligent solution. From a methodological point of view, the combination of word embedding and deep networks has recently become an effective approach to sentiment analysis. In this approach, word embeddings are obtained by training deep networks on large text corpora, which produces a vector for each word. For Persian, previous methods have used formal corpora such as Wikipedia dumps. Because formal and informal Persian texts differ substantially, the resulting vectors often do not perform well on users' comments on social networks and websites, which are usually written in informal form. To overcome this weakness, this paper constructs a large text corpus by integrating several different sources of informal comments and creates word vectors from it using the fastText algorithm. To make the best use of these vectors, an attention-based LSTM network is proposed, because this model lets each word contribute to determining the sentiment of the text according to its importance. The proposed method is evaluated on two datasets presented in this paper, “Taaghche” and “Filimo”. The results indicate a significant advantage of using informal vectors in sentiment analysis. The results also show that applying the attention model enhances the performance of the deep network in the sentiment analysis of Persian texts.
Keywords— Sentiment Analysis, Word Embedding, LSTM Network, Attention Model
Student name
Mohsen Bakhtiar
Presentation date
9/2/2020 12:00:00 AM
Full text
70364
Creator
Mohsen Bakhtiar
Date of data entry
1399/12/05
Title in English
Sentiment Analysis of Persian informal texts using embedded informal words and attention-based LSTM network
Keywords in Latin script
Keywords— Sentiment Analysis, Word Embedding, LSTM Network, Attention Model