Abstract
Axillary lymph node status is a major prognostic factor for patients with breast cancer. Axillary Lymph Node Dissection (ALND) and Sentinel Lymph Node Biopsy (SLNB) are the common methods for assessing axillary lymph node status in these patients. However, both can cause complications, including lymphedema, pain, numbness, infection, and limited shoulder movement. In this study, a lymph node metastasis prediction model was developed to help physicians make informed decisions and avoid unnecessary surgeries. The clinical and pathological information of 1999 breast cancer patients from Motamed Breast Cancer Research Institute was used to develop the model. The MissForest method was used to impute missing data, and the Recursive Feature Elimination (RFE) method was used to select important features. After examining various parameter settings of RFE, the features obtained with the Logistic Regression estimator were selected as the final features for modeling. These features are reproductive status, menstrual regularity, history of other cancers, tumor size, pathology report, breast cancer subtype, presence of distant metastasis, neoadjuvant chemotherapy, and tumor location. Various Machine Learning algorithms were used for prediction, including Decision Tree, Naive Bayes, K-Nearest Neighbor, Support Vector Machines, Logistic Regression, Linear Discriminant Analysis, Random Forest, Multilayer Perceptron, Boosting, Voting, and Stacking Classifiers. The best-performing model was a voting classifier that combines Naive Bayes, XGBoost, and Multilayer Perceptron models. This model achieved a precision of 0.734, a recall of 0.707, an F1-score of 0.707, and an AUC of 0.786. As a tool, this model can support an early diagnostic strategy for lymph node metastasis in breast cancer patients and help physicians and specialists in preoperative decision-making.
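To illustrate how a voting classifier combines its member models and how the reported precision, recall, and F1-score are defined, the following is a minimal sketch using only the Python standard library. It assumes soft voting (averaging predicted probabilities), which the abstract does not specify; the study may have used hard voting or weighted votes. The function names, the 0.5 threshold, and all probabilities and labels are hypothetical toy values, not data or results from the study.

```python
from statistics import mean


def soft_vote(model_probs, threshold=0.5):
    """Average each patient's positive-class probability across models,
    then label the patient positive (1) if the average reaches the threshold.
    Soft voting is an assumption here, not a detail taken from the study."""
    averaged = [mean(per_patient) for per_patient in zip(*model_probs)]
    return [1 if p >= threshold else 0 for p in averaged]


def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Hypothetical predicted probabilities of lymph node metastasis for six
# patients, one list per member model (toy values for illustration only).
naive_bayes_probs = [0.9, 0.2, 0.6, 0.4, 0.7, 0.1]
xgboost_probs = [0.8, 0.3, 0.4, 0.6, 0.9, 0.2]
mlp_probs = [0.7, 0.1, 0.7, 0.3, 0.4, 0.3]

# Hypothetical ground-truth labels (1 = metastasis present).
y_true = [1, 0, 1, 1, 0, 0]

y_pred = soft_vote([naive_bayes_probs, xgboost_probs, mlp_probs])
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(y_pred, round(precision, 3), round(recall, 3), round(f1, 3))
```

In this toy example the ensemble disagrees with individual members on some patients: the third patient is labeled positive even though one model scored below 0.5, because the averaged probability still clears the threshold.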