سحر احمدي سهرويه

Title

Point Cloud Deep Learning for 3D Data Classification

Degree level

Master's

Field of study

Electronics - Digital

Defense date

1399/3/13

Supervisor

Dr. Sattar Mirzakuchaki

Faculty

Electrical Engineering

Abstract

Over the past decade, deep learning has achieved remarkable success in processing and understanding two-dimensional data and has become the method of choice for tasks such as classification, segmentation, and detection. For this reason, it has also begun to be applied in the three-dimensional domain, drawing on the rich data now available. However, because of the complex geometric nature of 3D objects and the large structural variations that arise from different 3D representations, this is not straightforward and brings many challenges. One of the important representations of 3D data is the point cloud. A point cloud is an unordered set of points of varying density in three-dimensional Euclidean space. Unlike other representations such as voxels, it has lower complexity and computational cost, and it is the closest form of data to the raw output of 3D capture devices such as lidar, depth cameras, and radar. Growing practical use in robotics, self-driving cars, unmanned aerial vehicles, and virtual reality has also made this representation popular. Nevertheless, because the points in this type of data have no inherent order and can be freely permuted, convolutional neural networks cannot be applied to them the way they are applied to 2D images. In this thesis, a network that operates directly on the point cloud representation is used to classify 3D data. To improve this network, the use of an attention mechanism is proposed. To this end, an attention module in 3D space is designed to suit the structure of point cloud data and the challenges of processing it; the module extracts richer features from the input so that the global signature obtained from the whole shape carries better and more useful information. The network was evaluated on ModelNet40, a well-known dataset for 3D data classification, and reached an overall accuracy of 89.9% and a mean accuracy of 87.1%. Finally, a comparison of our results with the work of others shows that when this module is used, the input and feature alignment networks are not needed and, in addition to the gain in accuracy, the amount of computation is greatly reduced.
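The point about unordered, permutable points can be made concrete with a small sketch. The code below is not the network from this thesis; it assumes a generic design in which one shared layer is applied to every point and the results are max-pooled into a single global signature. The function names, the random weights, and the sizes (256 points, 8 channels) are all hypothetical and chosen only for illustration.

```python
import random


def point_features(point, weights, bias):
    """Shared per-point layer: a linear map followed by ReLU."""
    return [
        max(0.0, sum(w * x for w, x in zip(row, point)) + b)
        for row, b in zip(weights, bias)
    ]


def global_signature(cloud, weights, bias):
    """Max-pool every feature channel over all points.

    Max is a symmetric function, so the result does not depend on
    the order in which the points are listed.
    """
    features = [point_features(p, weights, bias) for p in cloud]
    return [max(channel) for channel in zip(*features)]


rng = random.Random(0)

# Hypothetical toy data: 256 random 3D points and an 8-channel shared layer.
cloud = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(256)]
weights = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(8)]
bias = [rng.gauss(0, 1) for _ in range(8)]

# Reorder the same points; the underlying shape is unchanged.
shuffled = cloud[:]
rng.shuffle(shuffled)

signature = global_signature(cloud, weights, bias)
signature_shuffled = global_signature(shuffled, weights, bias)
print(signature == signature_shuffled)
```

Because every point passes through the same layer and the pooling step ignores order, both orderings yield the same signature. A 2D convolution, by contrast, relies on a fixed grid of neighbours, which a point cloud does not provide.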

Date of record entry

1399/04/23

Title in English

3D Data Classification By Point Cloud Based on Deep Learning

Release date

6/3/2021 12:00:00 AM

Student who entered the record

سحر احمدي سهرويه

Author: سحر احمدي سهرويه

Abstract in English

Over the past decade, deep learning has achieved significant success in processing and understanding two-dimensional data and has become the leading option for tasks such as classification, segmentation, and recognition. For this reason, it has also been applied in the three-dimensional field, using the rich data now available. However, this has not been simple: the complex geometric nature of 3D objects and the large structural changes caused by different three-dimensional representations bring many challenges. One of the most important types of 3D data representation is the point cloud. A point cloud is an unordered collection of points with varying density in three-dimensional Euclidean space. Unlike other representations such as voxels, it has lower complexity and computational cost, and it is the closest type of data to the raw data received from 3D recording devices such as LIDAR, depth cameras, and RADAR. Increasing practical application in robotics, self-driving cars, UAVs, and virtual reality has made this type of representation popular. However, because of the inherent irregularity and permutability of the points in this type of data, convolutional neural networks cannot be used on it the way they are used on two-dimensional images. In this thesis, a network that processes the point cloud representation directly is used to classify three-dimensional data. To improve this network, the use of an attention mechanism is proposed. For this purpose, a three-dimensional attention module is designed in accordance with the structure of point cloud data and the challenges of processing it; the module extracts richer features from the input so that the global signature obtained from the whole shape contains better and more useful information.
To evaluate the performance of the network, ModelNet40, a well-known dataset for three-dimensional data classification, was used, and we reached an overall accuracy of 89.9% and an average accuracy of 87.1%. Finally, a comparison of our results with the work of others shows that when this module is used, in addition to the increase in accuracy, the volume of calculations and the training time are greatly reduced.
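The two accuracy figures can differ because they are averaged differently. The sketch below assumes they correspond to the usual pair reported for classification benchmarks: overall accuracy (the fraction of all test shapes classified correctly) and mean class accuracy (the unweighted average of per-class accuracies). The abstract does not define them, so this reading is an assumption, and the labels and predictions are invented toy data, not results from the thesis.

```python
from collections import defaultdict


def overall_accuracy(y_true, y_pred):
    """Fraction of all samples whose predicted label is correct."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def mean_class_accuracy(y_true, y_pred):
    """Average of per-class accuracies, each class weighted equally."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)


# Invented, imbalanced toy example: 8 chairs and 2 lamps.
y_true = ["chair"] * 8 + ["lamp"] * 2
y_pred = ["chair"] * 8 + ["lamp", "chair"]

print(overall_accuracy(y_true, y_pred))     # 9 of 10 correct -> 0.9
print(mean_class_accuracy(y_true, y_pred))  # (8/8 + 1/2) / 2 -> 0.75
```

When classes are imbalanced, errors on a small class pull the mean class accuracy down more than the overall accuracy, which is one common reason the second figure is the lower of the two.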

Link to this record

https://dl.iust.ac.ir/dl/search/default.aspx?Term=22086&Field=0&DTC=6