Hamid Nabilou

Title

Increasing accuracy and reducing costs in stress detection from text using active learning

Degree level

Master's

Field of study

Computer Engineering, Artificial Intelligence track

Year of study

1401

Defense date

1404/04/18

Supervisor

MohammadReza Jahed Motlagh

Advisor

Faculty

Computer Engineering

Abstract

In recent years, the rapid growth of social network use and the production of vast volumes of textual data have created new opportunities for analyzing people's mental state with artificial intelligence. One important problem in this area is the timely detection of psychological stress, which can play an effective role in preventing more serious disorders such as depression. Given the importance of early stress detection, the aim of this research is to develop an intelligent model for detecting stress from Persian text based on active learning. The lack of effective models for detecting stress from Persian text is, however, a fundamental challenge, one that requires collecting a suitable dataset, designing labeling methods, and choosing a learning model suited to the characteristics of the Persian language. This shortcoming has constrained the development of intelligent text-based psychological analysis systems and highlights the need for an effective, localized solution. The proposed solution comprises two separate approaches to data labeling and model training. In the first, text data collected from the social network Twitter were labeled entirely by humans and then used to train a deep neural network. In the second, an approach based on active learning was used: a large language model performed the initial labeling, and a confidence score was used to identify challenging samples and select them for human labeling. The same score was also used within the active learning process so that, by focusing on difficult samples, the quality of the training set improved and the number of uninformative samples dropped considerably. Finally, a comparison of the two approaches showed that active learning, in addition to notably improving model accuracy, substantially reduces the financial, time, and human-labor costs of labeling; two fundamental challenges nevertheless remain in this framework. First, the choice of the confidence threshold depends on the characteristics of the data and may not be optimal under different conditions. Second, the method depends on large language models for the initial labeling, which in some cases may produce incorrect labels even when the confidence score is high.
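As a minimal sketch of the confidence-based routing described in the abstract, the following Python snippet splits LLM-labeled samples into those accepted automatically and those sent to human annotators. The `LabeledSample` type, the `route_by_confidence` function, the sample texts, and the threshold of 0.8 are illustrative assumptions; the abstract does not state how the confidence score is computed or which threshold the thesis uses.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class LabeledSample:
    text: str
    llm_label: str      # label proposed by the large language model
    confidence: float   # hypothetical confidence score in [0, 1]


def route_by_confidence(
    samples: List[LabeledSample], threshold: float
) -> Tuple[List[LabeledSample], List[LabeledSample]]:
    """Split samples into (auto_accepted, needs_human_review)."""
    auto_accepted = [s for s in samples if s.confidence >= threshold]
    needs_review = [s for s in samples if s.confidence < threshold]
    return auto_accepted, needs_review


# Made-up samples, for illustration only.
samples = [
    LabeledSample("example tweet A", "stress", 0.95),
    LabeledSample("example tweet B", "no_stress", 0.55),
    LabeledSample("example tweet C", "stress", 0.80),
]
accepted, review = route_by_confidence(samples, threshold=0.8)
print(len(accepted), len(review))
```

Raising the threshold sends more samples to human review and lowers the risk of keeping a wrong LLM label; lowering it cuts annotation cost. This is the trade-off behind the first challenge the abstract names.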

Record entry date

1404/05/20

Title in English

Increasing accuracy and reducing costs in stress detection from text using Active Learning

Availability date

7/9/2025

Record entered by (student)

Hamid Nabilou

Abstract in English

In recent years, the widespread use of social media and the massive generation of textual data have opened new opportunities for analyzing individuals' mental states using artificial intelligence. One critical task in this domain is the timely detection of psychological stress, which can play a significant role in preventing more severe disorders such as depression. Given the importance of early stress detection, this study aims to develop an intelligent model for detecting stress in Persian texts using an active learning approach. However, the lack of efficient models for stress detection in Persian remains a major challenge. This gap highlights the need for appropriate dataset collection, labeling strategies, and learning models tailored to the characteristics of the Persian language. Such limitations have hindered the development of effective text-based psychological analysis systems, underscoring the need for a localized and practical solution. The proposed solution consists of two distinct approaches to data labeling and model training. In the first approach, textual data collected from Twitter is fully annotated by human experts and then used to train a deep neural network. In the second approach, an active learning framework is employed: a large language model performs the initial labeling, and a confidence score is used to identify challenging samples for manual annotation. This confidence score is also used throughout the active learning process to focus on difficult examples, thereby improving the quality of the training set and significantly reducing the number of non-informative samples. Ultimately, the comparison between the two approaches showed that active learning not only significantly improves model accuracy but also greatly reduces the financial, time, and human costs of data annotation. Nonetheless, two main challenges remain in this framework: first, selecting the confidence threshold, which depends on the data characteristics and may not be optimal under different conditions; and second, the reliance on large language models for initial labeling, which can occasionally produce incorrect labels even at high confidence scores.
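The idea of focusing the active learning process on difficult examples can be illustrated with least-confidence sampling, a common query strategy. This is a sketch under assumptions: the abstract does not say which query strategy the thesis uses, and the function name `least_confident_indices`, the probability values, and the `budget` parameter are hypothetical.

```python
from typing import List, Sequence


def least_confident_indices(
    probabilities: Sequence[Sequence[float]], budget: int
) -> List[int]:
    """Return indices of the `budget` samples whose top class probability is lowest."""
    top_probs = [max(p) for p in probabilities]
    # Sort sample indices from least to most confident prediction.
    ranked = sorted(range(len(top_probs)), key=lambda i: top_probs[i])
    return ranked[:budget]


# Hypothetical class probabilities (stress, no_stress) for four unlabeled texts.
pool_probs = [
    [0.97, 0.03],
    [0.52, 0.48],
    [0.70, 0.30],
    [0.10, 0.90],
]
query = least_confident_indices(pool_probs, budget=2)
print(query)  # indices of the samples to send for human annotation
```

In each round, the selected samples would be labeled by humans, added to the training set, and the model retrained before the next selection.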

Persian keywords

Persian text-based stress detection model, natural language processing, active learning, deep learning, artificial intelligence

English keywords

Persian text-based stress detection model, natural language processing, active learning, deep learning, artificial intelligence

Author

Hamid Nabilou

Supervisor

MohammadReza Jahed Motlagh

Link to this document

https://dl.iust.ac.ir/dl/search/default.aspx?Term=33543&Field=0&DTC=6