English Abstract
Generative adversarial networks have attracted considerable attention in recent years, and substantial improvements have been made both in network architecture and in applications. One such application is image synthesis, a major part of which is face image synthesis. Despite breathtaking progress in face image synthesis with generative adversarial networks, there is still a lack of knowledge about how these networks work and how they learn. In addition, high computational costs limit accessibility and hinder editing and further training of these networks. Editability is usually obtained by adding auxiliary networks alongside the primary network, which again incurs high computational costs. Furthermore, qualitative improvement of the synthesized images has always been a goal pursued by researchers in this field.
To this end, by dissecting generative adversarial networks trained on face datasets, segmenting the classes within the generated images, and identifying the units that respond to human-interpretable concepts, we tried to gain a better understanding of how generative adversarial networks work and how they learn concepts.
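As a rough illustration of this dissection step, the sketch below scores each unit of one generator layer against a concept mask produced by an external segmentation network, using intersection-over-union; the function name, tensor shapes, and the top-1% activation threshold are assumptions for illustration, not the thesis's exact procedure.

```python
import torch
import torch.nn.functional as F

def unit_concept_iou(features, seg_mask, q=0.99):
    """Score how strongly each unit's active region overlaps a concept mask.

    features: (N, C, H, W) activations from one generator layer.
    seg_mask: (N, 1, Hs, Ws) binary mask for one segmentation class
              (e.g. "hair"), produced by an external segmentation network.
    Returns a (C,) tensor of IoU scores; high-IoU units are taken to
    "sense" the concept.
    """
    _, c, h, w = features.shape
    # Bring the concept mask to the feature-map resolution.
    mask = F.interpolate(seg_mask.float(), size=(h, w), mode="nearest")
    # Per-unit activation threshold: keep roughly the top 1% of activations.
    flat = features.permute(1, 0, 2, 3).reshape(c, -1)
    thresh = flat.quantile(q, dim=1).view(1, c, 1, 1)
    active = (features > thresh).float()
    inter = (active * mask).sum(dim=(0, 2, 3))
    union = (active + mask).clamp(max=1.0).sum(dim=(0, 2, 3))
    return inter / union.clamp(min=1.0)
```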
Next, using the units identified in the previous step, we study the ability to edit face images in pre-trained networks, as well as a possible connection between editing physical concepts and editing abstract ones. We also investigate how edits affect the final image by modifying different regions of a lower-layer feature map and measuring their presence in the output.
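A minimal sketch of such an edit in PyTorch, assuming a pre-trained generator whose layers can be hooked; the layer and unit names in the usage comments are placeholders:

```python
def edit_units(layer, unit_ids, scale=0.0, region=None):
    """Rescale selected units of one generator layer during the forward pass.

    layer:    the PyTorch generator sub-module whose output feature map
              is edited.
    unit_ids: indices of the units tied to a concept (from dissection).
    scale:    0.0 removes the concept; values above 1.0 amplify it.
    region:   optional (y0, y1, x0, x1) box restricting the edit to part
              of the feature map; None edits the whole map.
    Returns the hook handle, so the edit can be undone with handle.remove().
    """
    def hook(module, inputs, output):
        out = output.clone()
        if region is None:
            out[:, unit_ids] *= scale
        else:
            y0, y1, x0, x1 = region
            out[:, unit_ids, y0:y1, x0:x1] *= scale
        return out

    return layer.register_forward_hook(hook)

# Hypothetical usage with a pre-trained generator G and units hair_units:
# handle = edit_units(G.layer4, hair_units, scale=0.0)  # suppress the concept
# image = G(z)
# handle.remove()  # restore the unedited generator
```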
Finally, efforts have been made to improve the quality of the synthesized images by identifying problematic units and removing them from the network. To this end, 22 problematic units were identified and removed from a Progressive GAN trained on the CelebA-HQ dataset. The improvement was measured with the FID criterion, which decreased from 39.10 for the original network to 28.74 for the improved one. For further comparison, a layperson and an expert were each shown 100 image pairs, each pair containing one image from the original network and one from the improved network, and were asked to select the better image. The layperson preferred the image from the improved network in 53 of the 100 pairs, and the expert in 57 of the 100 pairs, which indicates the superiority of the improved network.

In addition, efforts have been made to bring the statistical distribution of the synthesized images closer to that of the data. For this purpose, 1000 images synthesized by the original network and 1000 images randomly drawn from the dataset were segmented, and the per-class pixel counts in each set were computed with a segmentation network. Based on the differences found between the two sets, the distributions of the two groups of images were brought closer together by attenuating or amplifying the identified units associated with the classes showing the largest differences, an attempt that was largely successful.
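For reference, the FID criterion used above compares Gaussians fitted to Inception-v3 feature embeddings of real and generated images; a minimal sketch of the metric, assuming the feature means and covariances have already been extracted:

```python
import numpy as np
from scipy import linalg

def fid(mu1, sigma1, mu2, sigma2):
    """Frechet Inception Distance between two image sets, given the mean
    and covariance of their Inception-v3 feature embeddings."""
    diff = mu1 - mu2
    # Matrix square root of the covariance product; numerical noise can
    # introduce a small imaginary component, which is discarded.
    covmean = linalg.sqrtm(sigma1 @ sigma2)
    if np.iscomplexobj(covmean):
        covmean = covmean.real
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))
```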
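The distribution comparison can be sketched as a per-class pixel histogram, assuming an external face-parsing network that returns an integer label map per image; the segmenter interface and the class count of 19 are placeholders:

```python
import torch

def class_pixel_histogram(images, segmenter, num_classes=19):
    """Fraction of pixels assigned to each segmentation class.

    `segmenter(images)` is assumed to return an (N, H, W) tensor of
    integer class labels from an external face-parsing network.
    """
    labels = segmenter(images)
    counts = torch.bincount(labels.flatten(), minlength=num_classes)
    return counts.float() / counts.sum()

# Hypothetical usage with two batches of 1000 images each:
# gen_hist  = class_pixel_histogram(generated_images, segmenter)
# real_hist = class_pixel_histogram(dataset_images, segmenter)
# Classes with the largest |gen_hist - real_hist| mark the concepts
# whose units are attenuated or amplified to match the distributions.
```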