Abstract (English)
Advances in biomedical equipment and healthcare have produced a considerable amount of data for analysis, especially in the Intensive Care Unit (ICU). Predicting ICU mortality is one of the most significant problems in healthcare data analysis. An accurate prediction of mortality risk for ICU patients could improve the quality of care and reduce costs at the earliest possible stage. Over the past several decades, numerous scoring systems and machine learning models have been developed to predict ICU mortality. This paper introduces a new ensemble machine learning model, based on the Stacking ensemble method, for early mortality prediction on a highly imbalanced dataset. SMOTE, an over-sampling technique, is used to address the class imbalance. In addition, feature selection based on feature importance is applied. To validate its performance, the new model is compared with various machine learning models, including Random Forest, K-Nearest Neighbor, Artificial Neural Network, XGBoost, Support Vector Machine (with Polynomial, Radial Basis Function, and Sigmoid kernels), Decision Tree, Logistic Regression, and Naïve Bayes. The results obtained with 10-fold cross-validation and hold-out methods indicate that the new ensemble model achieves the best mortality prediction performance among all implemented models. Additionally, the Friedman test, a statistical significance test, is applied to examine the differences between classifiers. Its results support the conclusion that the new ensemble model is more effective than the other classifiers.
Furthermore, the feature importance results show that eliminating insignificant features improves the performance of the proposed model.
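
As an illustration of the over-sampling step mentioned in the abstract, the following is a minimal sketch of the core SMOTE idea: new minority-class points are created by interpolating between a minority sample and one of its nearest minority-class neighbours. It is not the implementation used in this work. The function name `smote_sketch`, its parameters, and the toy data are hypothetical and chosen only for this example.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (the core SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical toy data: 3 minority-class samples against 9 majority-class samples.
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.5)]
majority_count = 9
new_points = smote_sketch(minority, majority_count - len(minority), k=2)
balanced_minority = minority + new_points
print(len(balanced_minority), "minority samples after over-sampling")
```

In a real evaluation the over-sampling would normally be applied to the training folds only, so that synthetic points do not leak into the validation or hold-out data.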