Abstract (English)
Early detection of breast cancer is of great importance because it increases the chance of recovery and reduces the mortality rate. Since breast cancer is one of the most common cancers among women, effective and accurate detection methods are needed.
This study investigates the effect of synthetic data on the accuracy of cancer detection in mammography images. Two generative approaches were used to produce the synthetic data: a Diffusion model and StyleGAN. Five deep learning models (ResNet18, ResNet34, ResNet152, EfficientNetB0, and MaxViT) were trained on synthetic data combined with the original data. In the testing phase, the models were evaluated on original data only, comprising 234 cancer images and 10,732 healthy images.
The experiments were carried out in three stages. In the first stage, the models were trained and evaluated using only the original data. In the second stage, the original data were combined with synthetic data generated by the Diffusion model. In the third stage, the original data were combined with synthetic data generated by StyleGAN. The results showed that adding synthetic data significantly improved diagnostic accuracy, and that augmenting with Diffusion-generated data yielded a greater improvement than augmenting with StyleGAN-generated data.
To analyze the results, confusion matrices were calculated and presented for each scenario. The results were also tabulated using four evaluation metrics: accuracy, sensitivity, precision, and F1-score. In addition, ROC curves were plotted along with the AUC values for each scenario, which indicated that the models performed better when trained with synthetic data.
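As an illustration of how the four reported metrics follow from the confusion-matrix counts, the following is a minimal sketch using the standard definitions. It is not the implementation used in this study. The function name `metrics_from_confusion` and the counts in the second example are hypothetical; only the test-set class sizes (234 cancer, 10,732 healthy) come from the text above.

```python
def metrics_from_confusion(tp, fp, fn, tn):
    """Compute accuracy, sensitivity, precision, and F1 from confusion-matrix counts.

    tp/fn count cancer images classified correctly/incorrectly;
    tn/fp count healthy images classified correctly/incorrectly.
    Ratios with a zero denominator are reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # also called recall
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
    }


# Test-set class sizes stated in the abstract.
N_CANCER = 234
N_HEALTHY = 10_732

# Trivial baseline: label every image as healthy.
baseline = metrics_from_confusion(tp=0, fp=0, fn=N_CANCER, tn=N_HEALTHY)

# Hypothetical classifier output (illustrative counts, not results from the study).
example = metrics_from_confusion(tp=180, fp=100, fn=54, tn=10_632)

print(baseline)
print(example)
```

Because the test set is heavily imbalanced, the all-healthy baseline already reaches an accuracy of 10,732 / 10,966 while detecting no cancers at all, which is why sensitivity, precision, and F1-score are reported alongside accuracy.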
This research shows that synthetic data, especially data generated by the Diffusion model, can significantly improve the accuracy of cancer detection from mammography images. The results can inform the development and improvement of diagnostic methods in the future.