Abstract
The rapid advancement of artificial intelligence (AI) and machine learning (ML) has profoundly influenced many industries, notably healthcare and medicine. In particular, integrating AI with medical technologies has improved the diagnosis, prognosis, and prediction of various cancers. Prostate cancer, one of the most common malignancies among men, often progresses asymptomatically in its early stages, predominantly affects middle-aged and elderly individuals, and contributes significantly to cancer-related mortality. It may metastasize to other sites, such as the spine and pelvis, often resulting in pain, bone fractures, and motor impairments. This study proposes an ensemble learning-based classification model for the early prediction of prostate cancer. Three datasets comprising clinical, imaging, lifestyle, and demographic features were used, covering both suspected and confirmed prostate cancer cases. The modeling process involved comprehensive data preprocessing, model training, and hyperparameter tuning. In addition to conventional ML algorithms, two ensemble methods, Random Forest (RF) and Extremely Randomized Trees (ExtraTrees), were employed as base learners for ensemble construction. After training, the best-performing models were selected based on classification accuracy. The results show that ensemble models consistently outperformed individual classifiers. Specifically, Extreme Gradient Boosting (XGBoost) achieved an accuracy of 82% on the first dataset, Adaptive Boosting (AdaBoost) reached 96.2% on the second dataset, and Stacking achieved 99.4% on the third dataset. Moreover, the voting, bagging, and stacking ensembles showed superior overall performance across all datasets. The influence of prostate-specific antigen (PSA) was also analyzed and found to have a positive effect on predictive performance.
Furthermore, association rule mining was applied to identify lifestyle-related patterns associated with prostate cancer. Frequent patterns and association rules were extracted using the Apriori algorithm with a minimum support of 0.1 and a minimum confidence of 0.4. Rules with prostate cancer presence as the consequent were then retained; the most significant rules achieved a support of 0.929 and a lift of 1.85 in the second dataset, and a support of 0.643 and a lift of 1.353 in the third dataset.
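
For readers unfamiliar with the rule metrics reported above, the following minimal sketch shows how support, confidence, and lift are computed for a single candidate rule and how the stated thresholds (minimum support 0.1, minimum confidence 0.4) would be applied. The transactions and item names are hypothetical toy data, not the study's datasets, and the sketch evaluates one rule only; it does not implement Apriori's candidate generation and is not the code used in this work.

```python
# Hypothetical toy transactions; item names are illustrative assumptions.
transactions = [
    {"smoker", "high_fat_diet", "cancer"},
    {"smoker", "cancer"},
    {"high_fat_diet"},
    {"smoker", "high_fat_diet", "cancer"},
    {"active"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(antecedent, consequent, transactions):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    s_rule = support(set(antecedent) | set(consequent), transactions)
    s_ante = support(antecedent, transactions)
    s_cons = support(consequent, transactions)
    if s_ante == 0 or s_cons == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    confidence = s_rule / s_ante   # estimate of P(consequent | antecedent)
    lift = confidence / s_cons     # values above 1 indicate positive association
    return s_rule, confidence, lift


# Thresholds as stated in the abstract.
MIN_SUPPORT, MIN_CONFIDENCE = 0.1, 0.4

sup, conf, lift = rule_metrics({"smoker"}, {"cancer"}, transactions)
keep = sup >= MIN_SUPPORT and conf >= MIN_CONFIDENCE
print(f"support={sup:.3f} confidence={conf:.3f} lift={lift:.3f} keep={keep}")
```

A lift above 1 means the antecedent and the consequent co-occur more often than would be expected if they were independent, which is why lift is reported alongside support for the selected rules.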