Ghazaleh Mahmoudi

Title

Textual Stance Detection in Social Networks

Degree level

Master's

Field of study

Computer Engineering

Academic year

1400

Defense date

1402/11/30

Supervisor

Dr. Sayyed Sauleh Eetemadi

Advisor

None

Faculty

Computer Engineering

Abstract

Today, social networks are a venue for freely expressing beliefs and sharing opinions. As a result, analyzing the data available on social networks can give a broad and comprehensive view of the stances that different users take on various topics, including political, economic, social, and cultural issues. In natural language processing, stance detection is the automatic identification of the stance a text takes toward a specific, given target. In natural language processing tasks such as stance detection, the way text data is preprocessed has a considerable effect on the performance of the trained model. This research introduces and examines seven different levels of preprocessing. In addition, the search for a stance detection model architecture draws on the idea of neural architecture search. In this method, the model architecture is divided into four main parts, a search space is defined for each part, and an adaptive search algorithm is used to design the final architecture. The best proposed model uses the BERTweet encoder and a CNN classifier. The designed architecture reached 74.47% on the F1 metric, a 19.97% improvement over the baseline model. The proposed method also ranked third among the 19 participants in a stance detection event on climate change. Because training data is scarce for many topics, stance detection without training data was also investigated. This method relies on large language models and prompt engineering, and four approaches based on different prompt types were introduced. The performance of the proposed prompts was then compared with other methods for stance detection without training data. The introduced approach reached 57.33% on the F1 metric, a 2.03% improvement over similar approaches.

Date of data entry

1403/03/29

Title in English

Stance Detection for Textual Content in Social Media

Utilization date

1/1/1900 12:00:00 AM

Student who entered the data

Ghazaleh Mahmoudi

Abstract (English)

Nowadays, social media is a platform for freely expressing and sharing opinions and thoughts. As a result, analyzing the data available on social media can provide a broad and comprehensive perspective on users' opinions and positions on different topics. These topics include political, economic, social, and cultural issues. In Natural Language Processing, stance detection is the process of automatically recognizing the stance of a given text toward a specific target. In natural language processing tasks, the way text data is preprocessed significantly affects the performance of the trained model. In this research, seven different levels of preprocessing are introduced and examined. Additionally, the search for a stance detection model architecture was inspired by the idea of Neural Architecture Search (NAS). In this method, the model architecture is divided into four main parts, a search space is defined for each part, and adaptive search algorithms are used to design the final architecture. The best proposed model uses BERTweet as the encoder and a CNN as the classifier. The proposed architecture achieved an F1-Score of 74.47%, a 19.97% improvement over the baseline model. Furthermore, the proposed method ranked third among 19 participants in a climate change stance detection event. Because training data is scarce for many topics, stance detection without training data was also investigated. This approach, which uses large language models and prompt engineering, introduces four approaches based on different prompt types. The performance of the proposed prompts was then compared with other methods for stance detection without training data. The introduced approach achieved an F1-Score of 57.33%, a 2.03% improvement over similar approaches.

Persian keywords

Stance detection, natural language processing, deep learning, large language models

English keywords

Stance Detection, Natural Language Processing, Deep Learning, Large Language Models

Author

Ghazaleh Mahmoudi

Supervisor

Sauleh Eetemadi

Link to this record

https://dl.iust.ac.ir/dl/search/default.aspx?Term=30958&Field=0&DTC=6