Ramin Ghorbani

Title

Analysis of health data using a data mining approach in the form of an ensemble model (case study: intensive care unit patients)

Degree level

Master's

Field of study

Systems Optimization

Defense date

1398/8/27

Supervisors

Dr. Rouzbeh Ghousi - Dr. Ahmad Makui

Advisor

Dr. Alireza Atashi

Faculty

Industrial Engineering

Abstract

Advances in medical equipment and in the level of healthcare have generated a large volume of valuable medical information and data. Extracting useful information by processing the available data is very important and can save the lives of many patients. Today, predicting the survival status of patients in hospital intensive care units (ICUs) is one of the most important topics in medical data analysis. Accurate prediction of the survival status of ICU patients can improve the quality of care and sharply reduce the associated costs, so making this prediction as early as possible is very important. Over the past few decades, several scoring systems and data mining prediction models have been developed to predict the survival status of ICU patients. This research introduces a new data mining prediction model for ICU patients, based on the stacking ensemble method and imbalanced data. Using imbalanced data leads to unacceptable predictions, so an over-sampling technique is used to balance the data. In addition, the random forest algorithm is used to identify the important features and the degree of influence of each. To validate the performance of the implemented models, two validation methods are used: simple (hold-out) validation and k-fold cross-validation. The new model is compared with various data mining models, including random forest, k-nearest neighbors, artificial neural network, gradient boosting, support vector machine, decision tree, logistic regression, and naive Bayes. The results show that the model introduced in this research has the best performance in predicting patient survival status among the implemented models. The Friedman statistical test is also used to examine whether the differences between the prediction models are significant and to determine the best-performing model. The results of the Friedman test show that the new model is more effective than the other models. Determining the influence of the features on the models' predictions was another objective of this research. One notable result is that the performance of the prediction model improved after three low-importance features were removed.
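The Friedman test mentioned in the abstract compares several models by ranking them within each evaluation round (for example, each cross-validation fold) and checking whether the rank sums differ more than chance would suggest. The following is a minimal sketch of the test statistic only, written with the Python standard library. The function names and the score table are hypothetical illustrations, not the thesis's code or results, and the sketch omits the tie correction and the p-value lookup that a full implementation would include.

```python
def average_ranks(row):
    """Rank the values in one row (1 = smallest); tied values share their mean rank."""
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of equal values.
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(scores):
    """Friedman chi-square statistic for an n-rounds by k-models score table."""
    n, k = len(scores), len(scores[0])
    rank_sums = [0.0] * k
    for row in scores:
        for model, rank in enumerate(average_ranks(row)):
            rank_sums[model] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)


# Hypothetical scores: 4 evaluation rounds (rows) x 3 models (columns).
# These numbers are invented for illustration and are NOT from the thesis.
scores = [
    [0.80, 0.85, 0.90],
    [0.78, 0.83, 0.91],
    [0.81, 0.84, 0.89],
    [0.79, 0.86, 0.92],
]
stat = friedman_statistic(scores)
print(f"Friedman statistic: {stat:.3f}")
```

The statistic is compared against a chi-square distribution with k - 1 degrees of freedom. A large value suggests that at least one model ranks consistently differently from the others, after which the rank sums indicate which model tends to perform best.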

Date of data entry

1398/11/27

Title in English

Analysis of healthcare dataset using the data mining approach as an ensemble model (Case Study: Intensive Care Units patients)

Date of availability

11/17/2020 12:00:00 AM

Student who entered the data

Ramin Ghorbani

Author: Ramin Ghorbani

Abstract in English

Advances in biomedical equipment and in the level of healthcare have produced a considerable amount of data for analysis, especially in the Intensive Care Unit (ICU). Mortality prediction in the ICU is considered one of the most significant topics in healthcare data analysis. A precise, early prediction of mortality risk for ICU patients could improve the quality of care and reduce costs. Over the past several decades, numerous scoring systems and machine learning prediction models have been developed to predict ICU mortality. This thesis introduces a new ensemble machine learning model, based on the Stacking ensemble method, for early mortality prediction on a highly imbalanced dataset. SMOTE, an over-sampling technique, is used to address the class imbalance. Feature selection based on feature importance is also applied. To validate its performance, the new model is compared with various machine learning models, including Random Forest, K-Nearest Neighbors, Artificial Neural Network, XGBoost, Support Vector Machine (polynomial, radial basis function, and sigmoid kernels), Decision Tree, Logistic Regression, and Naïve Bayes. Results obtained with 10-fold cross-validation and hold-out validation indicate that the new ensemble model has the best mortality prediction performance among all implemented models. Additionally, the Friedman test, a statistical significance test, is applied to examine the differences between classifiers; its results support the conclusion that the new ensemble model is more effective than the other classifiers. Furthermore, the feature importance results confirm that eliminating insignificant features improves the performance of the proposed model.
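The over-sampling step named in the abstract (SMOTE) balances the classes by creating synthetic minority-class samples on the line segments between a minority point and one of its nearest minority neighbors. The following is a minimal standard-library sketch of that interpolation idea. The function name `smote_sketch`, its parameters, and the toy points are assumptions made for illustration; this is not the thesis's implementation, and practical work would normally rely on an established library.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority-class neighbors of the base point (excluding itself).
        others = [p for p in minority if p is not base]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


# Toy minority-class points with two features each (invented for illustration).
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_sketch(minority, n_new=6)
for point in new_points:
    print(point)
```

Because each synthetic point lies between two real minority samples, it stays inside the region the minority class already occupies. In a validation setup such as the one described above, over-sampling would typically be applied to the training portion only, so that synthetic points do not leak into the evaluation data.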

Link to this document

https://dl.iust.ac.ir/dl/search/default.aspx?Term=21711&Field=0&DTC=6