Ali Farzipour

Title

Traffic Sign Recognition with Vision Transformers in Autonomous Vehicles

Degree level

Master's

Field of study

Electrical Engineering

Year of study

1399 (Solar Hijri calendar)

Defense date

1401/12/20

Supervisor

Dr. Shahriar Baradaran Shokouhi

Faculty

Electrical Engineering

Abstract (translated from Persian)

In recent years, self-driving cars have attracted considerable attention from researchers. These vehicles promise greater safety for the car and its passengers, which leads to a significant reduction in accidents. Correct recognition of traffic signs is one of the requirements of vehicle safety. It requires extracting features from the captured images and using those features to identify the type of sign. To learn image features, we turn to the various methods of deep learning. Among the most widely used are convolutional networks, which have many advantages over conventional neural networks. A characteristic of these networks is that they attend to a specific location in the image: at each step, the filters of a convolutional network extract the relationships among the pixels of one region. Following the introduction of the attention mechanism and transformers, and their success in natural language processing, researchers began applying them to machine vision problems. The mechanism used in transformers extracts the relationships between each pixel of the image and all the other pixels; in other words, vision transformers attend to the image as a whole. In this work, we present a new model that combines both approaches, transformer and convolution, to benefit from each. To extract features more simply and effectively in the layer close to the model input, and to reduce computation, we use 1×1 convolutions with different numbers of filters to extract features, followed by a 3×3 convolution to find the relationships among those features. To keep the model from memorizing the data instead of learning from it, we use a combined data augmentation method. To evaluate the proposed model, we compare it with 5 to 7 well-known models on 3 different datasets. The proposed model reaches 99.66% accuracy on the first dataset and 99.79% on the second, higher than the other works. On the challenging third dataset, which has 60 parts, it also achieves better overall results than the other models. These results indicate that the proposed model, in addition to higher accuracy, also performs better under challenging conditions.

Record entry date

1402/03/01

Title in English

Traffic sign recognition using vision transformer in autonomous vehicles

Availability date

3/10/2024 12:00:00 AM

Record entered by (student)

Ali Farzipour


Abstract in English

In recent years, self-driving cars have attracted considerable research interest. These vehicles promise greater safety for the car and its passengers and, in turn, a significant reduction in accidents. Accurate recognition of traffic signs is essential to that safety. It requires extracting features from the captured images and using them to identify the type of sign. Deep learning offers a variety of methods for learning image features. Among the most widely used are convolutional networks, which have many advantages over conventional neural networks. A characteristic of these networks is that they attend to a specific location in the image: at each step, a convolutional filter extracts the relationships among the pixels of one region. Following the introduction of the attention mechanism and transformers, and their success in natural language processing, researchers began applying them to machine vision problems. The mechanism used in transformers extracts the relationships between each pixel of the image and all the other pixels; in other words, vision transformers attend to the image as a whole. In this work, we present a new model that combines both approaches, transformer and convolution, to benefit from each. To extract features more simply and effectively in the layer close to the model input, and to reduce computation, we use 1×1 convolutions with different numbers of filters to extract features, followed by a 3×3 convolution to find the relationships among those features. To keep the model from memorizing the data instead of learning from it, we use a combined data augmentation method. To evaluate the proposed model, we compare it with 5 to 7 well-known models on 3 different datasets. The proposed model achieves 99.66% accuracy on the first dataset and 99.8% on the second, higher than the other works. On the challenging third dataset, which has 61 parts, it also achieves better results than the other models. These results indicate that the proposed model, in addition to higher accuracy, also performs better under challenging conditions.
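
The computational saving attributed to the 1×1 convolutions can be illustrated with a simple cost count. The following is a minimal sketch and not the thesis model: the filter counts (32, 16, 64), the 32×32 input size, the sequential arrangement of the layers, and the helper name `stack_cost` are all assumptions chosen for illustration. It compares a stem of two 1×1 convolutions followed by a 3×3 convolution against a baseline with the same filter counts but 3×3 kernels throughout.

```python
def stack_cost(layers, in_ch, height, width):
    """Sum parameters and multiply-accumulates (MACs) over a chain of conv layers.

    layers: list of (kernel_size, out_channels) pairs. Stride 1 and 'same'
    padding are assumed, so every layer keeps the height x width resolution.
    """
    params = 0
    macs = 0
    ch = in_ch
    for kernel, out_ch in layers:
        weights = kernel * kernel * ch * out_ch
        params += weights + out_ch          # weights plus one bias per filter
        macs += weights * height * width    # one MAC per weight per output pixel
        ch = out_ch
    return params, macs


# Hypothetical stem: two 1x1 convs with different filter counts, then a 3x3 conv.
pointwise_stem = [(1, 32), (1, 16), (3, 64)]
# Hypothetical baseline: same filter counts, but 3x3 kernels in every layer.
all_3x3_stem = [(3, 32), (3, 16), (3, 64)]

for name, layers in [("1x1 + 3x3", pointwise_stem), ("all 3x3", all_3x3_stem)]:
    params, macs = stack_cost(layers, in_ch=3, height=32, width=32)
    print(f"{name}: {params} parameters, {macs} MACs")
```

Under these assumed sizes the 1×1-based stem needs roughly two thirds of the multiply-accumulates of the all-3×3 baseline, because a 1×1 kernel uses one ninth of the weights of a 3×3 kernel for the same channel counts. The actual saving in the thesis model depends on its real layer configuration, which this record does not give.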

Keywords (translated from Persian)

transformer, deep learning, object recognition, self-driving car

Keywords in English

transformer, deep learning, object recognition, self-driving car

Author

Ali Farzipour

Supervisor

Dr. Shahriar Baradaran Shokouhi

Link to this record

https://dl.iust.ac.ir/dl/search/default.aspx?Term=28296&Field=0&DTC=6