Mohammad Gholamrezaei

Title

Using and Comparing Data Mining Techniques and Tools to Prevent and Detect Auto Insurance Fraud

Degree level

Master's

Field of study

Industrial Engineering - Macro Economic and Social Systems

Defense date

1400/8/25 (Solar Hijri)

Supervisor

Mehdi Ghazanfari

Advisor

Rassoul Noorossana (رسول نورالسناء)

Faculty

Industrial Engineering

Abstract

The insurance industry, as one of the most important industries, plays a significant and influential role in the quality and economic way of life of a country's people. Every year a considerable share of countries' current capital is allocated to this industry to compensate people for loss of life and financial losses. Insurance claim verification involves various parties centered on the injured party, and the constantly changing nature of fraud schemes makes fraud possible in this industry. Insurance companies therefore always face the risk of fraud, and ignoring this risk can even lead to bankruptcy. In recent years, the development of information management systems and the storage of data on each claim have made it more feasible to use data mining and artificial intelligence approaches to identify and detect fraud. Although researchers have proposed various approaches in this area, some neglected methods remain to be examined, evaluated, and compared. Accordingly, this research aims to examine and evaluate several non-ensemble and ensemble classification methods for identifying fraud cases in auto insurance. To this end, the individual methods decision tree, support vector machine (SVM), logistic regression, and k-nearest neighbors, the ensemble methods AdaBoost, LogitBoost, and AdaBoost, and a deep learning approach based on a multilayer perceptron network were examined. The methods were evaluated on labeled auto insurance claim data from the Razi Insurance company. Because the extracted data are class-imbalanced, the SMOTE method was used to address this, and each classification method was evaluated both with and without it. Among the non-ensemble algorithms, the SVM achieved the highest accuracy. Among the ensemble learning algorithms, random forest, AdaBoost, and RUSBoost had equal accuracy.
Among the non-ensemble algorithms, the SVM performed better on the identification criterion than logistic regression and k-nearest neighbors. Among the ensemble learning algorithms, random forest, AdaBoost, and RUSBoost performed identically on this criterion. The deep learning algorithm performed worse on this criterion than the ensemble learning methods and the SVM. On the F1-Score and G-Mean criteria, the ensemble learning algorithms random forest, AdaBoost, and RUSBoost performed better. Overall, however, the deep neural network algorithm performed better than the other algorithms. On F1-Score and G-Mean its performance was similar to that of the ensemble learning methods, and in identification it performed better than the other methods.
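
The abstract names accuracy, F1-Score, and G-Mean but does not define them. The following is a minimal sketch using the common textbook definitions (G-Mean taken here as the square root of sensitivity times specificity, which is an assumption about the thesis); the function names and the toy labels are hypothetical and are not taken from the thesis data.

```python
import math

def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        else:
            if t == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn

def imbalance_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, specificity, precision, F1 and G-Mean."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on fraud cases
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on normal cases
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": 2 * precision * sensitivity / denom if denom else 0.0,
        "g_mean": math.sqrt(sensitivity * specificity),
    }

# Toy example (hypothetical labels): 2 fraud cases among 10 claims.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_normal = [0] * 10  # a classifier that never flags fraud
print(imbalance_metrics(y_true, always_normal))
```

On class-imbalanced data a classifier that never flags fraud can still score high accuracy while its F1 and G-Mean are zero, which is one plausible reason to report these criteria alongside accuracy.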

Data entry date

1400/12/24

Title in English

Using and Comparing Data Mining Techniques and Tools to Prevent and Detect Auto Insurance Fraud

Release date

11/16/2022 12:00:00 AM

Student who entered the data

Mohammad Gholamrezaei

Name: Mohammad Gholamrezaei
Author: Mohammad Gholamrezaei

Abstract in English

As a key industry, insurance plays an influential role in the economic lifestyle and quality of life of a country's people. Every year a considerable percentage of a country's current capital is allocated to this industry to compensate people for financial losses and loss of life. Several parties, centered on the injured party, are involved in establishing an insurance claim. At the same time, the dynamic nature of insurance fraud schemes makes fraud possible in this industry. Insurance companies are therefore constantly exposed to the risk of fraud, and disregarding this risk can even lead to bankruptcy. In recent years, the development of information management systems and the storage of data on each claim have made it more feasible to use artificial intelligence and data mining approaches to detect fraud. Although researchers have proposed various approaches in this area, several neglected methods remain to be investigated and compared. In light of that, this research investigates and assesses several ensemble and non-ensemble classification methods for detecting fraud in the auto insurance industry. To this end, the individual methods decision tree, support vector machine (SVM), logistic regression, and k-nearest neighbors, the ensemble methods AdaBoost, LogitBoost, and AdaBoost, and a deep learning method based on a multilayer perceptron network were examined. The methods were assessed on labeled auto insurance claim data from the Razi Insurance company, Iran. Because the extracted data are class-imbalanced, the SMOTE method was used to address this issue, and each classification method was assessed once with and once without it. Among the non-ensemble algorithms, the SVM was the most accurate.
Among the ensemble learning algorithms, random forest, AdaBoost, and RUSBoost had equal accuracy. On the identification criterion, the SVM performed better than logistic regression and k-nearest neighbors among the non-ensemble algorithms. Among the ensemble learning algorithms, random forest, AdaBoost, and RUSBoost performed equally on this criterion. The deep learning algorithm performed worse on this criterion than the ensemble learning methods and the SVM. Random forest, AdaBoost, and RUSBoost performed better on the F1-Score and G-Mean criteria. Overall, however, the deep neural network algorithm performed better than the other algorithms. Its performance was similar to that of the ensemble learning methods on F1-Score and G-Mean, and it performed better than the other methods in identification.
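
The abstract states that SMOTE was used to handle class imbalance but gives no implementation details. As a rough illustration of the idea only, the sketch below creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbors. It assumes purely numeric features, picks base points at random rather than iterating over every sample, and uses a hypothetical function name `smote_sketch`; none of this is taken from the thesis.

```python
import random

def smote_sketch(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic points from a list of numeric minority samples.

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbors.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest neighbors of base (squared Euclidean distance).
        neighbors = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(base, minority[j])),
        )[:k]
        other = minority[rng.choice(neighbors)]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic

# Hypothetical minority class with two numeric features.
fraud = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
extra = smote_sketch(fraud, n_new=6, k=2)
print(len(extra))
```

When comparing classifiers with and without oversampling, as the abstract describes, synthetic samples are usually generated from the training split only, so that the test split still reflects the original class distribution.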

Persian keywords

auto insurance, data mining, fraud detection, data science

English keywords

auto insurance, data mining, data science, fraud detection

Author

Mohammad Gholamrezaei

Supervisor

Mehdi Ghazanfari

Link to this document

https://dl.iust.ac.ir/dl/search/default.aspx?Term=26236&Field=0&DTC=6