سيد سعيد جان نثار

Title

An evolutionary feature selection algorithm for improving classification accuracy in high-dimensional datasets

Degree level

Master's

Field of study

Artificial Intelligence and Robotics

Academic year

93-94

Defense date

1396/12/19

Supervisor

Dr. امين نيك انجام

School

Computer

Abstract

Feature selection plays an important role in improving the performance of machine learning algorithms: it reduces the time needed to build the learning model and increases the accuracy of the learning process. Identifying a suitable feature selection method is very important for machine learning tasks that involve high-dimensional data. High-dimensional data contain irrelevant and redundant features. Irrelevant features provide no useful information and cannot contribute to the learning process; redundant features carry duplicated or similar information and add nothing beyond the features already selected. Both mislead the learning process. Feature selection is a dimensionality reduction technique that works by removing some of the features. In this thesis we present a new feature selection method, named Hybrid Evolutionary Feature Selection, for high-dimensional search spaces, based on a genetic algorithm. The proposed system is able to reduce the number of features used in the classification stage and is divided into two main parts. The first part applies a feature ranking method to reduce the number of features, and the second uses a search strategy based on a genetic algorithm, as a wrapper method, to find a subset of features with high discriminative power. Because the second part is a wrapper method, it needs a classifier to act as the fitness function, and the k-nearest-neighbor classifier was chosen. Accordingly, a new method for finding the best value of the parameter "k" for this classifier is also presented, and we introduce and use a new evaluation measure, named "relative importance", alongside accuracy in order to reach more useful features. Finally, to show the effectiveness of the system, experiments were carried out with other evolutionary algorithms and their results were compared with those of the proposed algorithm.
Keywords: evolutionary feature selection, genetic algorithm, classification, feature selection in high-dimensional datasets

Data entry date

1397/04/02

Title in English

An Evolutionary Feature Selection Algorithm In High-Dimensional Data Set To Improve Classification Accuracy

Release date

6/23/2018 12:00:00 AM

Student who entered the data

سيد سعيد جان نثار

Name: سيد سعيد جان نثار
Author: سيد سعيد جان نثار

Abstract in English

Abstract: Feature selection plays a significant role in improving the performance of machine learning algorithms, both by reducing the time needed to build the learning model and by increasing the accuracy of the learning process. Identifying a suitable feature selection method is essential for any machine learning task with high-dimensional data. High-dimensional data contain irrelevant and redundant features. Irrelevant features cannot contribute to the learning process, and redundant features carry the same information as others; hence they mislead the learning process. Feature selection is the process of removing redundant and irrelevant features from a dataset to improve the performance of machine learning algorithms. This thesis presents a novel GA-based approach, named “HEFS”, for feature selection in high-dimensional spaces. The proposed system is able to greatly reduce the number of features used in the classification phase. The system is based on two modules. The first employs a feature ranking method to reduce the number of features to be taken into account. The second uses a GA-based search strategy with a wrapper fitness function to find feature subsets with high discriminative power. Because the second module is a wrapper approach, it needs a learning algorithm to serve as the fitness function, and the kNN classifier was chosen. We therefore also introduce an automatic mechanism for finding a suitable value of k for that classifier. In addition, we introduce a novel measure, named “relative importance” and obtained from the first module, so that only useful features are considered alongside the accuracy measure. Finally, to assess the effectiveness of the proposed system, several experiments were performed, and the results were compared with those achieved by five different evolutionary feature selection algorithms.
Keywords: Evolutionary Feature Selection, Genetic Algorithm, Classification, High-Dimensional Data
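
Illustrative sketch (not part of the thesis record): the two-module structure described in the abstract, a filter ranking followed by a GA wrapper with a kNN fitness, can be outlined in plain Python as below. This is a minimal sketch under stated assumptions, not the HEFS implementation. The ranking score (between-class variance ratio), the GA operators, the small size penalty, the fixed k=3, and every function name are assumptions made for illustration. The thesis's "relative importance" measure and its automatic k-selection mechanism are not reproduced, because the abstract does not define them.

```python
import random
from collections import Counter

def rank_features(X, y, keep):
    """Module 1 (filter): score each feature by the share of its variance
    explained by the class means; return indices of the `keep` best ones.
    The score is an assumed stand-in for the thesis's ranking method."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        mean = sum(col) / len(col)
        var = sum((v - mean) ** 2 for v in col) / len(col) or 1e-12
        between = 0.0
        for c in classes:
            vals = [v for v, label in zip(col, y) if label == c]
            between += len(vals) * (sum(vals) / len(vals) - mean) ** 2
        scores.append(between / (len(col) * var))
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:keep]

def knn_loo_accuracy(X, y, features, k=3):
    """Leave-one-out accuracy of kNN restricted to the given feature indices."""
    if not features:
        return 0.0
    correct = 0
    for i, xi in enumerate(X):
        dists = sorted(
            (sum((xi[f] - xj[f]) ** 2 for f in features), y[j])
            for j, xj in enumerate(X) if j != i
        )
        votes = Counter(label for _, label in dists[:k])
        correct += votes.most_common(1)[0][0] == y[i]
    return correct / len(X)

def ga_select(X, y, candidates, k=3, pop_size=20, generations=15, seed=0):
    """Module 2 (wrapper): generational GA over bit masks of the candidate
    features, with kNN accuracy as the fitness (operators are assumptions)."""
    rng = random.Random(seed)
    n = len(candidates)
    cache = {}

    def fitness(mask):
        key = tuple(mask)
        if key not in cache:
            chosen = [f for f, bit in zip(candidates, mask) if bit]
            # Assumed small penalty so that smaller subsets win ties.
            cache[key] = knn_loo_accuracy(X, y, chosen, k) - 0.001 * len(chosen)
        return cache[key]

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        scored = [(fitness(m), m) for m in pop]

        def pick():  # binary tournament selection
            a, b = rng.sample(scored, 2)
            return (a if a[0] >= b[0] else b)[1]

        nxt = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(nxt) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, n) if n > 1 else 0
            child = p1[:cut] + p2[cut:]  # one-point crossover
            child = [bit ^ (rng.random() < 1.0 / n) for bit in child]  # bit-flip mutation
            nxt.append(child)
        pop = nxt
        best = max(pop, key=fitness)
    return [f for f, bit in zip(candidates, best) if bit]

def make_toy_data(n_samples=40, n_features=30, seed=1):
    """Synthetic two-class data; only features 0 and 1 carry class signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        row[0] += 3 * label
        row[1] -= 3 * label
        X.append(row)
        y.append(label)
    return X, y

X, y = make_toy_data()
top = rank_features(X, y, keep=8)   # module 1: 30 features down to 8
selected = ga_select(X, y, top)     # module 2: GA search within those 8
print("ranked:", top)
print("selected:", selected, "accuracy:", knn_loo_accuracy(X, y, selected))
```

Running the filter first keeps the GA search space small (here 2^8 masks instead of 2^30), which is the motivation the abstract gives for splitting the system into two modules.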

Link to this record

https://dl.iust.ac.ir/dl/search/default.aspx?Term=19069&Field=0&DTC=6