Helia ShamsJey

Title

Detection of False Data Injection Attack on Water Treatment Cyber-Physical System Using Machine Learning

Degree level

Master's

Field of study

Electrical Engineering - Control

Year of study

1401

Defense date

1403/11/30

Supervisor

Mohammadreza Jahed Motlagh

Advisor

Faculty

Electrical Engineering

Abstract

As large numbers of devices with different functions and purposes are interconnected in cyber-physical systems through digital communication technologies, critical systems have become vulnerable to a variety of cyber threats. Among these, false data injection attacks are particularly dangerous: by introducing false information they compromise the integrity and reliability of data, and even the smallest erroneous value can lead to catastrophic consequences. This thesis investigates the use of the Extended Isolation Forest machine learning algorithm to detect these attacks. The algorithm is an improved version of the standard Isolation Forest; by introducing random slopes and intercepts for partitioning the data, it improves on Isolation Forest performance and is suited to high-dimensional data. To further strengthen detection capability, the Isolation-Based Feature Selection method is used. This method applies the principles of Isolation Forest to select the features that best separate normal and anomalous data. By computing an imbalance score and applying a penalty to that score, it identifies the relevant features that improve the anomaly detection process. Integrating this method with Extended Isolation Forest ensures that the selected features not only provide good separation but also improve the model's detection performance. In addition, data augmentation is used to increase the accuracy and robustness of the machine learning model. These techniques generate synthetic data that mimic the distribution of normal data and thus provide a more comprehensive training set for the model. Data augmentation helps the model distinguish better between normal and anomalous data and improves its detection capability. Finally, the evaluation results showed that the proposed method can detect false data injection attacks accurately and efficiently and improves the evaluation metrics.
This thesis contributes to cybersecurity in cyber-physical systems by presenting a novel and effective method for detecting false data injection attacks, strengthening the defenses of critical systems against these threats.
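
The random-slope, random-intercept splitting that the abstract attributes to Extended Isolation Forest can be illustrated with a minimal sketch. This is not the thesis implementation: the function names (`c_factor`, `build_tree`, `path_length`, `fit_forest`, `anomaly_score`), the default tree count and the subsample size are all assumptions chosen for illustration, and the code uses only the Python standard library. It assumes a subsample of at least two points.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Split points recursively with a random hyperplane (random slope and intercept)."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = len(points[0])
    # Random slope: a normal vector with standard Gaussian components.
    normal = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    # Random intercept: a point drawn uniformly inside the node's bounding box.
    intercept = [
        rng.uniform(min(p[d] for p in points), max(p[d] for p in points))
        for d in range(dim)
    ]
    left, right = [], []
    for p in points:
        side = sum((x - b) * w for x, b, w in zip(p, intercept, normal))
        (left if side <= 0 else right).append(p)
    return {
        "normal": normal,
        "intercept": intercept,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(point, node):
    """Depth at which the point reaches a leaf, plus a correction for the leaf size."""
    depth = 0
    while "size" not in node:
        side = sum(
            (x - b) * w
            for x, b, w in zip(point, node["intercept"], node["normal"])
        )
        node = node["left"] if side <= 0 else node["right"]
        depth += 1
    return depth + c_factor(node["size"])


def fit_forest(data, n_trees=100, sample_size=64, seed=0):
    """Build n_trees trees, each on a random subsample of the data (a list of tuples)."""
    rng = random.Random(seed)
    psi = min(sample_size, len(data))  # assumed to be at least 2
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(data, psi), 0, max_depth, rng) for _ in range(n_trees)]
    return trees, psi


def anomaly_score(point, trees, psi):
    """Score in (0, 1]; shorter average paths give scores closer to 1."""
    mean_path = sum(path_length(point, t) for t in trees) / len(trees)
    return 2.0 ** (-mean_path / c_factor(psi))


if __name__ == "__main__":
    demo_rng = random.Random(1)
    readings = [(demo_rng.gauss(0, 1), demo_rng.gauss(0, 1)) for _ in range(200)]
    forest, psi = fit_forest(readings)
    print(anomaly_score((0.0, 0.0), forest, psi))  # a typical point
    print(anomaly_score((8.0, 8.0), forest, psi))  # a point far from the cluster
```

Points that are separated from the rest after only a few random cuts get short paths and therefore higher scores; a detection threshold on the score would still have to be chosen for the application.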

Data entry date

1404/04/23

Title in English

Detection of False Data Injection Attack on Water Treatment Cyber-Physical System Using Machine Learning

Date of use

1/1/1900 12:00:00 AM

Student who entered the data

Helia ShamsJey

Abstract in English

Critical cyber-physical systems are increasingly vulnerable to various cyber threats, among which false data injection attacks are particularly insidious. These attacks compromise the integrity and reliability of data by introducing false information, so that even a small change in the data can have catastrophic consequences. This thesis investigates the application of the Extended Isolation Forest machine learning algorithm to detect these attacks. This algorithm is an enhancement of the standard Isolation Forest that improves performance by introducing random slopes and intercepts for the branching cuts, making it more effective on high-dimensional data. To further enhance the detection capability, the Isolation-Based Feature Selection approach was employed. This approach uses the principles of the Isolation Forest to select the features that best separate normal and anomalous data. By calculating an imbalance score and applying an imbalance penalty, the Isolation-Based Feature Selection method identifies the most relevant features for the anomaly detection process. Integrating Extended Isolation Forest with this approach ensures that the selected features not only provide good separability but also enhance the model's performance in identifying attacks. Furthermore, data augmentation techniques were used to enhance the robustness and precision of the detection model. These techniques create synthetic data points that mimic the distribution of normal data, thereby providing a more comprehensive training set for the machine learning model. The augmented data helps the model distinguish better between normal and anomalous data points, improving its detection capabilities. Finally, the evaluation results demonstrated that the proposed approach could accurately and efficiently detect false data injection attacks.
The integration of data augmentation techniques further enhanced the model's robustness, reducing the risk of false positives and false negatives. This research contributes to the field of cybersecurity by providing a novel and effective method for detecting false data injection attacks, thereby strengthening the defenses of critical systems against such threats.
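
The abstract does not say which augmentation technique was used. As one possible illustration of "synthetic data points that mimic the distribution of normal data", the sketch below resamples normal records and adds small Gaussian jitter scaled to each feature's standard deviation. The function name `augment_normal`, the `noise_scale` parameter and the jitter approach itself are assumptions for illustration, not the method of the thesis.

```python
import random
import statistics


def augment_normal(samples, n_new, noise_scale=0.05, seed=0):
    """Return n_new synthetic points: resampled normal records plus small jitter.

    Assumed technique for illustration only. The jitter for each feature is
    Gaussian with standard deviation noise_scale times that feature's own
    (population) standard deviation in the input samples.
    """
    rng = random.Random(seed)
    dim = len(samples[0])
    stds = [statistics.pstdev([s[d] for s in samples]) for d in range(dim)]
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)  # pick an existing normal record
        synthetic.append(
            tuple(x + rng.gauss(0.0, noise_scale * sd) for x, sd in zip(base, stds))
        )
    return synthetic


if __name__ == "__main__":
    # Hypothetical two-feature sensor readings recorded during normal operation.
    normal_readings = [(1.0, 10.0), (1.2, 9.5), (0.9, 10.4), (1.1, 10.1)]
    extra = augment_normal(normal_readings, n_new=8)
    training_set = normal_readings + extra
    print(len(training_set))
```

Keeping `noise_scale` small keeps the synthetic points close to observed normal behaviour; larger values risk generating points that resemble the anomalies the detector is meant to flag.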

Keywords (translated from Persian)

Extended Isolation Forest, Data Augmentation, Isolation-Based Feature Selection, False Data Injection

Keywords in English

Extended Isolation Forest, Data Augmentation, Isolation-Based Feature Selection, False Data Injection

Author

Helia ShamsJey

Supervisor

Mohammadreza Jahed Motlagh

Link to this record

https://dl.iust.ac.ir/dl/search/default.aspx?Term=33507&Field=0&DTC=6