English Abstract
Generative adversarial networks have attracted considerable attention in recent years, and substantial improvements have been made both in network architecture and in applications. One such application is image synthesis, a major part of which is face image synthesis. Despite breathtaking progress in face image synthesis with generative adversarial networks, there is still a lack of knowledge about how these networks work and how they learn. In addition, high computational costs limit accessibility and hinder editing and further training of these networks. Editability is usually obtained by adding auxiliary networks alongside the primary network, which again incurs high computational costs. Furthermore, qualitative improvement of the synthesized images has always been a goal pursued by researchers in this field.
To this end, by dissecting generative adversarial networks trained on face datasets, segmenting the classes within the generated images, and identifying the units that respond to human-interpretable concepts, we tried to gain a better understanding of how generative adversarial networks work and how they learn concepts.
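As a rough illustration of this dissection step, the sketch below scores each unit of one generator layer against a concept mask produced by an external segmentation network, using intersection-over-union; the function name, tensor shapes, and the top-1% activation threshold are assumptions for illustration, not the thesis's exact procedure.

```python
import torch
import torch.nn.functional as F

def unit_concept_iou(features, seg_mask, q=0.99):
    """Score how strongly each unit's active region overlaps a concept mask.

    features: (N, C, H, W) activations from one generator layer.
    seg_mask: (N, 1, Hs, Ws) binary mask for one segmentation class
              (e.g. "hair"), produced by an external segmentation network.
    Returns a (C,) tensor of IoU scores; high-IoU units are taken to
    "sense" the concept.
    """
    _, c, h, w = features.shape
    # Bring the concept mask to the feature-map resolution.
    mask = F.interpolate(seg_mask.float(), size=(h, w), mode="nearest")
    # Per-unit activation threshold: keep roughly the top 1% of activations.
    flat = features.permute(1, 0, 2, 3).reshape(c, -1)
    thresh = flat.quantile(q, dim=1).view(1, c, 1, 1)
    active = (features > thresh).float()
    inter = (active * mask).sum(dim=(0, 2, 3))
    union = (active + mask).clamp(max=1.0).sum(dim=(0, 2, 3))
    return inter / union.clamp(min=1.0)
```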
Next, using the units identified in the previous step, we study the ability to edit face images in pre-trained networks, as well as a possible connection between editing physical concepts and editing abstract ones. We also investigate how edits affect the final image by modifying different regions of a lower-layer feature map and measuring their presence in the output.
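A minimal sketch of such an edit in PyTorch, assuming a pre-trained generator whose layers can be hooked; the layer and unit names in the usage comments are placeholders:

```python
def edit_units(layer, unit_ids, scale=0.0, region=None):
    """Rescale selected units of one generator layer during the forward pass.

    layer:    the PyTorch generator sub-module whose output feature map
              is edited.
    unit_ids: indices of the units tied to a concept (from dissection).
    scale:    0.0 removes the concept; values above 1.0 amplify it.
    region:   optional (y0, y1, x0, x1) box restricting the edit to part
              of the feature map; None edits the whole map.
    Returns the hook handle, so the edit can be undone with handle.remove().
    """
    def hook(module, inputs, output):
        out = output.clone()
        if region is None:
            out[:, unit_ids] *= scale
        else:
            y0, y1, x0, x1 = region
            out[:, unit_ids, y0:y1, x0:x1] *= scale
        return out

    return layer.register_forward_hook(hook)

# Hypothetical usage with a pre-trained generator G and units hair_units:
# handle = edit_units(G.layer4, hair_units, scale=0.0)  # suppress the concept
# image = G(z)
# handle.remove()  # restore the unedited generator
```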
Finally, efforts have been made to improve the quality of the synthesized images by identifying problematic units and removing them from the network. To this end, 22 problematic units were identified and removed from a Progressive GAN trained on the CelebA-HQ dataset. The improvement was measured with the FID criterion, which decreased from 39.10 for the original network to 28.74 for the improved one. For further comparison, a layperson and an expert were each shown 100 image pairs, each pair containing one image from the original network and one from the improved network, and were asked to select the better image. The layperson preferred the image from the improved network in 53 of the 100 pairs, and the expert in 57 of the 100 pairs, which indicates the superiority of the improved network.

In addition, efforts have been made to bring the statistical distribution of the synthesized images closer to that of the data. For this purpose, 1000 images synthesized by the original network and 1000 images randomly drawn from the dataset were segmented, and the per-class pixel counts in each set were computed with a segmentation network. Based on the differences found between the two sets, the distributions of the two groups of images were brought closer together by attenuating or amplifying the identified units associated with the classes showing the largest differences, an attempt that was largely successful.
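For reference, the FID criterion used above compares Gaussians fitted to Inception-v3 feature embeddings of real and generated images; a minimal sketch of the metric, assuming the feature means and covariances have already been extracted:

```python
import numpy as np
from scipy import linalg

def fid(mu1, sigma1, mu2, sigma2):
    """Frechet Inception Distance between two image sets, given the mean
    and covariance of their Inception-v3 feature embeddings."""
    diff = mu1 - mu2
    # Matrix square root of the covariance product; numerical noise can
    # introduce a small imaginary component, which is discarded.
    covmean = linalg.sqrtm(sigma1 @ sigma2)
    if np.iscomplexobj(covmean):
        covmean = covmean.real
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))
```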
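The distribution comparison can be sketched as a per-class pixel histogram, assuming an external face-parsing network that returns an integer label map per image; the segmenter interface and the class count of 19 are placeholders:

```python
import torch

def class_pixel_histogram(images, segmenter, num_classes=19):
    """Fraction of pixels assigned to each segmentation class.

    `segmenter(images)` is assumed to return an (N, H, W) tensor of
    integer class labels from an external face-parsing network.
    """
    labels = segmenter(images)
    counts = torch.bincount(labels.flatten(), minlength=num_classes)
    return counts.float() / counts.sum()

# Hypothetical usage with two batches of 1000 images each:
# gen_hist  = class_pixel_histogram(generated_images, segmenter)
# real_hist = class_pixel_histogram(dataset_images, segmenter)
# Classes with the largest |gen_hist - real_hist| mark the concepts
# whose units are attenuated or amplified to match the distributions.
```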