Abstract
Knee osteoarthritis is a degenerative joint disease that has a profound impact on knee function
and the quality of life of patients. Given the lack of a definitive cure and the limitations of
current methods, which mainly focus on symptom reduction, early diagnosis of this disease is
considered crucial. This research aims to develop a comprehensive machine-learning-based
diagnostic system using data from the Osteoarthritis Initiative (OAI). The data include
demographic information, clinical data, serum and urine biomarkers, and medication history.
In the preprocessing stage, data cleaning, missing-value handling, normalization, and
standardization were applied. Principal component analysis (PCA) was used for dimensionality
reduction, and the feature selection methods Recursive Feature Elimination (RFE) and Boruta
were used to identify key features. Then,
various algorithms, including decision trees, support vector machines, ensemble methods
(Random Forest and the gradient boosting algorithms XGBoost, LightGBM, and CatBoost), and
multilayer perceptrons (MLP), were trained and evaluated on different subsets of data
(clinical/demographic, biomarkers, and medication). The results show that combining clinical
and demographic data yielded the best diagnostic performance: the MLP, CatBoost, and XGBoost
models achieved F1-scores above 0.90 and AUC values above 0.98. Models trained on biomarker
data alone showed moderate performance, and models trained on medication data alone
struggled, although combining medication data with biomarkers gave a slight improvement.
The effect of PCA on model performance was mixed, improving results in some cases and
degrading them in others. This intelligent approach lays
the groundwork for accurate and early diagnosis of knee osteoarthritis and improved
therapeutic management and quality of life for patients.
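
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how an F1-score and an AUC value can be computed for a binary classifier. It is illustrative only and is not the evaluation code used in this research. The labels, scores, 0.5 decision threshold, and function names are all assumed toy values.

```python
# Illustrative sketch only: toy labels and scores, not OAI data.

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive case receives
    the higher score; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = osteoarthritis present, 0 = absent.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]           # assumed model probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]   # assumed 0.5 threshold

f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"F1 = {f1:.3f}, AUC = {auc:.3f}")
```

The F1-score depends on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds. This is why the two metrics are reported together.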