Abstract (English)
In recent decades, the rising prevalence of metabolic syndrome has become a challenge for health systems worldwide. The syndrome's economic and clinical burden imposes substantial direct and indirect costs on governments and patients every year. In recent years, the application of machine learning algorithms in public health, particularly for predicting and diagnosing disease, has attracted significant attention. The primary objective of this study is to predict metabolic syndrome in individuals using an analytical approach based on the CRISP-DM methodology (Cross-Industry Standard Process for Data Mining). The study applies ensemble learning algorithms, compares their performance in predicting and diagnosing metabolic syndrome, and identifies the most important risk factors.
This research used a dataset representing the U.S. population, which included laboratory and clinical measurements as well as demographic information (such as sex, race, marital status, and income). After data preprocessing, individual and ensemble machine learning algorithms were implemented and their hyperparameters were tuned through Randomized Search. The Gradient Boosting ensemble algorithm achieved the highest predictive performance, with an accuracy of 92.6%, surpassing the results reported in previous studies by 3.2%.
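As an illustration of the tuning step described above, the following is a minimal sketch of a randomized hyperparameter search over a gradient boosting classifier. It assumes scikit-learn; the synthetic data, the search space, and the number of sampled configurations are hypothetical stand-ins and are not the dataset or settings used in this study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic stand-in data; NOT the study's dataset.
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical search space; the study's actual ranges are not stated here.
param_distributions = {
    "n_estimators": [50, 100, 200],
    "learning_rate": [0.01, 0.05, 0.1, 0.2],
    "max_depth": [2, 3, 4],
    "subsample": [0.7, 0.85, 1.0],
}

# Sample 5 configurations at random and score each with 3-fold cross-validation.
search = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,
    cv=3,
    scoring="accuracy",
    random_state=0,
)
search.fit(X_train, y_train)

best_params = search.best_params_
# Accuracy of the refitted best model on the held-out split.
test_accuracy = search.score(X_test, y_test)
print(best_params, round(test_accuracy, 3))
```

Randomized search evaluates only a fixed number of sampled configurations, so its cost is set by `n_iter` rather than by the size of the full grid.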
Feature importance analysis revealed that the most influential risk factors for developing metabolic syndrome were elevated blood glucose levels, high triglycerides, increased waist circumference (abdominal obesity), and low HDL cholesterol. Additionally, variables such as the urine albumin-to-creatinine ratio, age, sex, and income also played a significant role. In the descriptive analysis section, high-risk groups were identified, and the potential influence of demographic variables, mental health status, and predominant dietary patterns across different subpopulations on disease development was briefly explored.
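A feature ranking of the kind described above can be sketched as follows. This example assumes scikit-learn's impurity-based `feature_importances_`; the abstract does not state which importance measure was used. The feature names are hypothetical labels attached to standardized synthetic values, so the resulting ranking says nothing about the study's data.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# Hypothetical feature names and synthetic values; not the study's data.
feature_names = ["glucose", "triglycerides", "waist_circumference", "hdl"]
rng = np.random.default_rng(0)
X = rng.normal(size=(500, len(feature_names)))

# Synthetic label driven mostly by the first column, partly by the second.
noise = rng.normal(scale=0.3, size=500)
y = (X[:, 0] + 0.5 * X[:, 1] + noise > 0).astype(int)

model = GradientBoostingClassifier(random_state=0).fit(X, y)

# Impurity-based importances, normalized by scikit-learn to sum to 1.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Impurity-based importances can favor features with many distinct values, so a permutation-based measure on held-out data is a common cross-check.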
The findings of this research not only enable more accurate prediction of metabolic syndrome and identification of high-risk groups but also contribute to enhancing preventive strategies, facilitating early diagnosis, and optimizing healthcare system resources to benefit patients, governments, and public health organizations. Early diagnosis of metabolic syndrome can help prevent cardiovascular disease, diabetes, and kidney failure, reduce hospitalization rates and long-term medical costs, and allow the health system's financial resources to be managed more efficiently.