Azadeh Sadat Mousavi

Title

Robust spatial-temporal person re-identification in a surveillance system based on a GAN

Degree level

Master's degree

Field of study

Electrical Engineering - Digital Electronic Systems

Years of study

1398-1400

Defense date

1400/11/20

Supervisor

Dr. Shahriar Baradaran Shokouhi

Faculty

Electrical Engineering

Abstract

Intelligent video surveillance is one of the main applications of machine vision. Person re-identification, as a component of such systems, is particularly important, since its accuracy determines the effectiveness of many surveillance algorithms. Person re-identification amounts to matching two images of the same walking person captured from two different viewpoints. In recent years it has attracted considerable attention and research because of its wide use in video investigation; nevertheless, the problem remains a recognized challenge that calls for further study, owing to large variations in illumination, pose, viewpoint, background, and other factors in pedestrian images. Recent re-identification methods focus on learning discriminative features that are robust to only one specific factor of variation (such as human pose) and that require corresponding supervisory signals (for example, pose keypoints). Most of these methods also ignore spatial-temporal constraints, so when the gallery database is very large in practice, they cannot perform well because of appearance ambiguity across camera views. To address these problems, we use a two-stream framework that extracts visual semantic information and spatial-temporal information. In the visual stream we exploit the image-generation capability of generative adversarial networks and train the generator to separate identity-related from identity-unrelated features in the input person images, using only identity labels and no auxiliary information, so that the proposed method becomes robust to intra-class variation in re-identification. Identity-related features carry information useful for determining a specific person (such as clothing), whereas identity-unrelated features capture other factors (such as human pose and scale changes).
In the spatial-temporal stream, we use a fast Histogram-Parzen method to approximate the complex spatial-temporal probability distribution, so that, with the help of the spatial-temporal constraint, the proposed model eliminates many irrelevant images and narrows the gallery database. Finally, we use a joint similarity metric to merge the two heterogeneous types of information into a unified framework. Experimental results demonstrate the effectiveness of this method: it achieves 93.8% rank-1 accuracy and 87.3% mAP on the DukeMTMC-reID dataset, and 98.2% rank-1 accuracy and 91.5% mAP on the Market-1501 dataset, an improvement of 1% in rank-1 accuracy and 4.8% in mAP over similar previous work.
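
The rank-1 accuracy and mAP figures quoted in the abstract are standard retrieval metrics. As a rough illustration of how they are computed, here is a minimal Python sketch. The function name `rank1_and_map` and its arguments are hypothetical, and the sketch omits the same-camera and junk-image filtering that the usual Market-1501 and DukeMTMC-reID protocols apply, so it is not the thesis's evaluation code.

```python
import numpy as np

def rank1_and_map(sim, query_ids, gallery_ids):
    """Rank-1 accuracy and mean average precision from a similarity matrix.

    sim: array of shape (num_query, num_gallery); higher means more similar.
    Simplification (assumption): no same-camera or junk-image filtering.
    """
    query_ids = np.asarray(query_ids)
    gallery_ids = np.asarray(gallery_ids)
    rank1_hits, average_precisions = [], []
    for q in range(sim.shape[0]):
        order = np.argsort(-sim[q])                # best match first
        matches = gallery_ids[order] == query_ids[q]
        if not matches.any():
            continue                               # no true match in the gallery
        rank1_hits.append(float(matches[0]))       # is the top result correct?
        hit_positions = np.flatnonzero(matches)    # 0-based ranks of true matches
        # Precision at each true match: (matches so far) / (rank of this match).
        precisions = np.arange(1, len(hit_positions) + 1) / (hit_positions + 1)
        average_precisions.append(precisions.mean())
    return float(np.mean(rank1_hits)), float(np.mean(average_precisions))
```

Rank-1 only checks the top result for each query, while mAP rewards ranking every true match highly, which is why the two numbers can differ noticeably on the same system.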

Date of data entry

1400/12/16

Title in English

Robust Spatial-Temporal person re-identification in a monitoring system based on GAN network

Date made available

2/9/2023 12:00:00 AM

Student who entered the data

Azadeh Sadat Mousavi

English abstract

Smart video surveillance is one of the main applications of machine vision. Person re-identification is a particularly important part of these systems, because its accuracy determines how well many monitoring algorithms perform. Person re-identification is the task of matching two images of a walking person taken from two different views. It has attracted much attention in recent years because of its wide application in video surveillance, and considerable research has been done on it. However, it remains a challenging problem that requires further study, owing to large variations in illumination, pose, viewpoint, and background in pedestrian images. Recent re-identification methods focus on learning discriminative features that are robust to only one specific factor of variation (such as human pose) and that require corresponding supervisory signals (such as pose annotations). Most of these methods also ignore spatial-temporal constraints, so when the gallery database is very large in practice, they may not work well because of appearance ambiguity across camera views. To address these problems, we use a two-stream framework that extracts visual semantic information and spatial-temporal information. In the visual stream, we use the image-generation ability of GANs and train the generator to separate identity-related and identity-unrelated features of the input images using only identity labels, without any auxiliary information, which makes the proposed method robust to intra-class variation in re-identification. Identity-related features contain information useful for identifying a particular person (such as clothing), while identity-unrelated features capture other factors (such as human pose and scale changes).
In the spatial-temporal stream, we use a fast Histogram-Parzen method to approximate the complex spatial-temporal probability distribution; the resulting spatial-temporal constraint removes many irrelevant images and narrows the gallery database. Finally, we use a joint similarity metric to integrate the two heterogeneous types of information into a unified framework. Experimental results show the effectiveness of this method, which achieves rank-1 accuracy of 98.2% and mAP of 91.5% on the Market-1501 dataset, and rank-1 accuracy of 93.8% and mAP of 87.3% on the DukeMTMC-reID dataset, improving on comparable previous work by 1% in rank-1 accuracy and 4.8% in mAP.
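
The spatial-temporal stream and the joint metric described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the kernel, the bin width, or the form of the joint metric, so the Gaussian kernel, the logistic smoothing, and every name and default value here (`histogram_parzen`, `logistic_smooth`, `joint_similarity`, `sigma_bins`, `lam`, `gamma`) are hypothetical choices rather than the thesis's implementation.

```python
import numpy as np

def histogram_parzen(time_gaps, num_bins, bin_width, sigma_bins=2.0):
    """Estimate P(time gap) for one camera pair from same-identity pairs.

    time_gaps: non-negative time differences between two sightings.
    Assumption: a Gaussian Parzen kernel measured in units of bins.
    """
    # Step 1: histogram of the observed time gaps.
    idx = (np.asarray(time_gaps) // bin_width).astype(int)
    idx = np.clip(idx, 0, num_bins - 1)
    hist = np.bincount(idx, minlength=num_bins).astype(float)
    # Step 2: Parzen-window smoothing of the histogram.
    radius = int(3 * sigma_bins)
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma_bins) ** 2)
    full = np.convolve(hist, kernel, mode="full")
    smoothed = full[radius:radius + num_bins]   # keep the centered part
    total = smoothed.sum()
    return smoothed / total if total > 0 else smoothed

def logistic_smooth(x, lam=1.0, gamma=5.0):
    """Squash a score so a near-zero value does not veto the other stream."""
    return 1.0 / (1.0 + lam * np.exp(-gamma * x))

def joint_similarity(visual_sim, st_prob):
    """Assumed joint metric: product of the two smoothed scores."""
    return logistic_smooth(visual_sim) * logistic_smooth(st_prob)
```

In such a scheme one distribution would be estimated per camera pair from the training set, and a gallery image whose time gap has low probability for its camera pair would be pushed down the ranking even when its appearance score is high.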

Keywords (translated from Persian)

Person re-identification, spatial and temporal information, visual semantic information, generative adversarial network

Keywords (English)

Person re-identification, spatial and temporal information, visual semantic information, GAN network

Author

Azadeh Sadat Mousavi

Supervisor

Dr. Shahriar B. Shokouhi

Link to this record

https://dl.iust.ac.ir/dl/search/default.aspx?Term=26193&Field=0&DTC=6