Houman Mehrafarin

Title

The Effect of Dataset Size on the Knowledge Stored in Fine-tuned Language Models

Degree level

Master's

Field of study

Computer Engineering

Year of study

Defense date

25 Ordibehesht 1401

Supervisor

Dr. Mohammad Taher Pilehvar - Dr. Sayyed Sauleh Eetemadi

Faculty

Computer Engineering

Abstract

With the ever-growing use of pre-trained models in language processing, researchers are looking for an explanation for the high performance of such models. To this end, using methods such as probing the representations of pre-trained models, they have found that certain linguistic information is encoded in them. A model's linguistic knowledge can be regarded as an explanation for the high performance of such models in language processing. Although these models perform very well on most tasks, they need to be fine-tuned on the target task for a few epochs, which always changes the model's weights and representations. These changes in the model's representations also lead to changes in the linguistic knowledge encoded in them. A number of studies have examined why fine-tuning is effective. Such studies are usually carried out by probing the model's representations. However, the studies conducted so far have overlooked the role of the size of the fine-tuning dataset. This report examines the importance of this factor for the model's probing performance and shows that the amount of encoded linguistic knowledge depends on the number of training examples in the fine-tuning task. The analyses also show that large datasets mainly affect the upper layers, and that the extent of this effect is due to the number of model updates. Finally, a set of experiments examines the effect of dataset size on the recovery of the altered linguistic knowledge.

Data entry date

1401/03/08

Title in English

The Effect of Data Size on the Encoded Knowledge of Fine-tuned Models

Availability date

5/15/2023 12:00:00 AM

Student who entered the data

Houman Mehrafarin

Name: Houman Mehrafarin
Author: Houman Mehrafarin

Abstract in English

With the increasing use of pre-trained models in Natural Language Processing (NLP), researchers seek explanations for the high performance of such models. To this end, by probing the representations of pre-trained models, they have discovered that certain linguistic features are encoded within them. The linguistic knowledge of pre-trained language models (PLMs) can be considered a reason behind their effectiveness in NLP tasks. Even though these models perform well on most tasks, they still need to be fine-tuned for a few epochs on a target task. Fine-tuning changes the model’s weights and representations, and these changes can in turn modify the model’s linguistic knowledge. Recent studies have therefore investigated why fine-tuning is effective. These studies usually analyze fine-tuning by probing the model’s representations; however, they have not taken the role of fine-tuning data size into account. In this thesis, the importance of data size for probing performance is highlighted, and it is shown that the extent of encoded linguistic knowledge depends on the number of fine-tuning samples. The analysis also reveals that larger training data mainly affects higher layers, and that the extent of this change depends on the number of iterations that update the model during fine-tuning rather than on the diversity of the training samples. Finally, through a set of experiments, the effect of data size on the recoverability of linguistic knowledge is investigated.

Persian keywords

Probing, Pre-trained Language Models, Linguistic Knowledge, Fine-tuning

English keywords

Probe, Pre-trained Language Models, Linguistic Knowledge, Fine-tuning

Author

Houman Mehrafarin

Supervisor

Mohammad Taher Pilehvar - Sayyed Sauleh Eetemadi

Link to this record

https://dl.iust.ac.ir/dl/search/default.aspx?Term=26548&Field=0&DTC=6