Abstract
In 2019, COVID-19, a zoonotic disease, emerged in China. The virus spread so rapidly that it became a global health challenge, and the development and production of COVID-19 vaccines became a top priority worldwide. The first COVID-19 vaccine was produced in late 2020, and widespread vaccination played a crucial role in reducing the spread of the virus globally. Despite these benefits, some vaccines may cause side effects which, depending on various factors, can pose challenges for individuals with different medical conditions.
This study investigates the side effects of the Moderna, Johnson & Johnson, and Pfizer vaccines using data from the adverse event reporting system supervised by the U.S. Food and Drug Administration. Side effects were categorized as severe, mild to moderate, or undetermined, based on information published by the World Health Organization and the Food and Drug Administration.
Statistical analyses showed that adults and elderly individuals experienced the highest rates of both severe and mild-to-moderate side effects. Of the vaccinated individuals in the dataset, 70.3% were women and 29.7% were men. Because the data were imbalanced, balancing techniques such as SMOTE were applied. Two prediction tasks were carried out, predicting side effects and predicting vaccine ineffectiveness, and classification models were used for both.
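The balancing step can be illustrated with a minimal sketch of the idea behind SMOTE: synthetic minority samples are created by interpolating between a minority sample and one of its nearest minority neighbours. This is a simplified, pure-Python illustration and not the implementation used in the study. The function name `smote_oversample`, the toy data, and the choice of `k` are assumptions made for the example. In practice a library such as imbalanced-learn would typically be used, and oversampling is normally applied to the training split only.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority
    samples and their k nearest minority neighbours (simplified SMOTE)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: 20 majority samples and 5 minority samples.
majority = [(float(i), float(i % 5)) for i in range(20)]
minority = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (2.5, 2.5), (3.0, 1.0)]

# Generate enough synthetic minority points to match the majority class size.
new_points = smote_oversample(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
```

Because every synthetic point lies on a line segment between two real minority samples, the new points stay inside the region already occupied by the minority class.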
For side-effect prediction, a 3-class model was implemented first, but because of its complexity, a 2-class model that distinguishes severe from mild-to-moderate side effects is presented instead. The models used were light gradient boosting (LightGBM), random forest, and logistic regression. The light gradient boosting model outperformed the other two, with an accuracy of 76.0%. In practical terms, the model correctly identifies the severity class of side effects in about 76% of cases, which could help alert individuals who intend to be vaccinated to the possibility of side effects. The second task, predicting vaccine ineffectiveness, was carried out using extreme gradient boosting (XGBoost), random forest, and light gradient boosting models. The random forest model achieved the highest performance, with an accuracy of 98.0%. To improve model performance in both tasks, random search was used to tune the hyperparameters. All models were implemented in Python using machine learning libraries and were trained and evaluated on separate training and test datasets.
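The random search procedure described above can be sketched as follows, assuming scikit-learn is the library in use (the abstract does not name it). The example tunes a random forest classifier with `RandomizedSearchCV` on synthetic data. The dataset, the search space in `param_distributions`, and the number of iterations are illustrative assumptions, not the settings used in the study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic, mildly imbalanced binary data standing in for the real dataset.
X, y = make_classification(
    n_samples=400, n_features=10, weights=[0.7, 0.3], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical search space; random search samples n_iter combinations from it.
param_distributions = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 5, 10],
    "min_samples_leaf": [1, 2, 5],
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,            # number of sampled hyperparameter combinations
    cv=3,                # 3-fold cross-validation on the training split
    scoring="accuracy",
    random_state=0,
)
search.fit(X_train, y_train)

# Evaluate the best configuration on the held-out test split.
test_accuracy = accuracy_score(y_test, search.predict(X_test))
print(search.best_params_, round(test_accuracy, 3))
```

The same pattern applies to the gradient boosting models by swapping the estimator and its search space. Random search evaluates only a sample of the combinations, which keeps tuning cost bounded as the search space grows.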