هيوا صوفي كريمي

Title

A Robust Object Recognition Method Inspired by Human Vision

Degree level

PhD

Field of study

Electronics

Year of study

1392

Defense date

1400/06/31

Supervisor

دكتر كريم محمدي

Faculty

Electrical Engineering

Abstract

The aim of this research is to examine the existing challenges in object recognition and to present a model that is robust and invariant to them. Object recognition faces challenges such as changes in scale, illumination, and rotation, as well as computational load, which make the recognition task very difficult for machines. Human vision, in contrast, is highly robust to environmental changes, including changes in scale, illumination, and position, and to clutter, so machine vision algorithms remain far behind human vision. This research therefore draws inspiration from human vision and models the visual cortex, both structurally and functionally, to present a robust object recognition model for machine vision. In addition to robustness to variations, this dissertation pays special attention to computational load, which is one of the challenges of implementation. To this end, a model named RIMAX is proposed, which has six main layers. In this model, the primary cortex is modeled by two layers, S1 and C1, the secondary cortex by the FE layer, and the V4 region by three layers, S2, C2, and FR. The layers are arranged in a hierarchical structure like the visual cortex, and each performs part of the robustness task. The findings show that the first and second layers play a significant role in robustness to illumination and minor local changes. The FE layer, which corresponds to the secondary cortex of the brain, has a significant effect on increasing the accuracy, repeatability, and reliability of the model. This layer, whose task is to extract non-random features of the image, increases the performance of the model by 20% on average, especially when the number of training samples is small or when few features are available. The FR layer provides robustness to scale and rotation, and the S2 and C2 layers are responsible for matching. The results show that the accuracy of RIMAX remains above 80% under scale changes in the range of 0.6 to 2 times, and that under angle changes its accuracy drops by at most 15% in the worst case.
The improvement in speed is very considerable: RIMAX is approximately eleven times faster than HMAX and, excluding the training phase, approximately four times faster than AlexNet.

Data entry date

1400/08/30

Title in English

Invariant Object Recognition Inspired by Human Vision

Date of availability

9/22/2022 12:00:00 AM

Student who entered the data

هيوا صوفي كريمي

Abstract in English

In this dissertation, we present an object recognition model inspired by the human visual system that addresses the robustness and invariance challenges of machine vision. Object recognition faces many challenges, such as changes in illumination, scale, and rotation, that make the task hard for machines. The human visual system, by contrast, is very robust to these variations. We therefore propose a robust and invariant object recognition model that mimics the human visual system both structurally and functionally. Besides improving robustness to image variations, the model reduces the computational load. The proposed model, called RIMAX, has six main layers: the primary visual cortex is modeled by two layers, S1 and C1; the secondary visual cortex by the FE layer; and the V4 region by three layers, FR, S2, and C2. Like the visual cortex, RIMAX has a hierarchical structure, and each layer has a specific role in improving robustness. The findings show that the first and second layers play an important role in robustness to illumination and minor local changes. The FE layer, which represents the secondary visual cortex, plays a critical role in increasing accuracy, repeatability, and reliability. This layer, whose task is to extract features from images, enhances the performance of RIMAX, especially when there are not enough training samples or features. The FR layer improves robustness to scale and rotation variations, and template matching is implemented in the S2 and C2 layers. The new template matching method significantly reduces the computational load. The results show that RIMAX is 20% more accurate than previous models when there are not enough training samples. It has also shown much better robustness to the challenges mentioned above.
For example, the accuracy of RIMAX remains above 80% under scale changes in the range of 0.6 to 2 times. In the worst case, its accuracy decreases by only 15% under angle changes. RIMAX is also faster than previous models: it is about eleven times faster than HMAX and four times faster than AlexNet.

Persian keywords

Visual cortex layers , Object recognition , Robustness to variations , Human vision , Computational load

English keywords

HMAX , Visual cortex layers , Object recognition , Robustness , Human visual system , Computational load

Link to this document

https://dl.iust.ac.ir/dl/search/default.aspx?Term=25585&Field=0&DTC=6