Tahmineh Asgardoon

Title

Investigating drug toxicity based on chemical and physical properties and molecular structure using machine learning tools, with a focus on carcinogenicity

Degree level

Master's

Field of study

Physical Chemistry

Academic year

1404

Defense date

1404/06/23

Supervisor

Dr. Majid Hashemianzadeh

Advisor

None

Faculty

Chemistry

Abstract

The carcinogenicity of drugs is one of the main challenges in developing pharmaceutical compounds and evaluating their safety, and it can lead to serious side effects and high clinical trial costs. This study set out to predict the solubility and carcinogenicity of chemical compounds with machine learning methods. A total of 863 drug compounds were selected from the PubChem database, and their key molecular features (molecular weight, partition coefficient, number of hydrogen bonds, polar surface area, etc.) were extracted. The target variables, carcinogenicity and solubility, were simulated. Advanced feature selection methods were applied to choose the seven top-ranked features, and the data were then split into a training set and a test set. Four machine learning models were trained: random forest, extreme gradient boosting, gradient boosting, and extra trees. The regression models for solubility were evaluated with the coefficient of determination and the mean squared error, and the classification models for carcinogenicity with criteria such as accuracy and the area under the ROC curve (ROC-AUC). The extra trees algorithm, with a training accuracy of 0.933 and a test accuracy of 0.875, was chosen as the best model for assessing carcinogenicity. In addition, the Shapley Additive Explanations method was used to interpret feature importance and analyze molecular dependencies. This computational approach can serve as an efficient tool for the initial screening of drug compounds, reduce laboratory costs, and improve drug safety.

Record entry date

1404/07/28

Title in English

Investigating drug toxicity based on chemical and physical properties and molecular structure with machine learning, with a focus on carcinogenicity

Release date

9/14/2026 12:00:00 AM

Record entered by (student)

Tahmineh Asgardoon

Name: Tahmineh Asgardoon
Author: Tahmineh Asgardoon

Abstract in English

Drug carcinogenicity is one of the main challenges in the development and safety evaluation of pharmaceutical compounds, and it can lead to serious side effects and high clinical trial costs. This study aimed to predict the solubility and carcinogenicity of chemical compounds using machine learning methods. A total of 863 drug compounds were selected from the PubChem database, and their key molecular features (molecular weight, partition coefficient, number of hydrogen bonds, polar surface area, etc.) were extracted. The target variables, carcinogenicity and solubility, were simulated. Advanced feature selection methods were used to select the top seven features, and the data were then divided into training and test sets. Four machine learning models were trained: RandomForest, XGBoost, GradientBoosting, and ExtraTrees. The regression models for solubility were evaluated with the coefficient of determination and the mean squared error, and the classification models for carcinogenicity with criteria such as accuracy and the area under the ROC curve (ROC-AUC). The ExtraTrees algorithm, with a training accuracy of 0.933 and a test accuracy of 0.875, was selected as the best model for investigating carcinogenicity. In addition, the SHAP method was used to interpret feature importance and analyze molecular dependencies. This computational approach can serve as an efficient tool for the initial screening of drug compounds, reduce laboratory costs, and improve drug safety.
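
The evaluation criteria named in the abstract (accuracy and ROC-AUC for classification, coefficient of determination and mean squared error for regression) can be sketched in plain Python as follows. This is only an illustrative sketch, not the thesis code: the function names are hypothetical, and the labels, scores, and solubility values are toy numbers invented for the example, not data or results from the study.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of class labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the chance that a random positive outscores a random negative.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average squared difference between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy classification example (invented values): 1 = carcinogenic, 0 = not.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.7, 0.4, 0.5, 0.1, 0.8, 0.3]    # model probabilities
predicted = [1 if s >= 0.5 else 0 for s in scores]   # threshold at 0.5

clf_accuracy = accuracy(labels, predicted)
clf_auc = roc_auc(labels, scores)

# Toy regression example (invented values) for a solubility-like target.
observed = [1.0, 2.0, 3.0, 4.0]
estimated = [1.5, 2.0, 2.5, 4.0]

reg_mse = mean_squared_error(observed, estimated)
reg_r2 = r_squared(observed, estimated)

print(clf_accuracy, clf_auc, reg_mse, reg_r2)
```

Accuracy depends on the chosen decision threshold, whereas ROC-AUC depends only on how the scores rank the compounds, which is one reason the two criteria are usually reported together.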

Persian keywords

drug toxicity, drug carcinogenicity

English keywords

drug toxicity, drug carcinogenicity

Author

Tahmineh Asgardoon

Supervisor

Dr. Majid Hashemianzadeh

Link to this record

https://dl.iust.ac.ir/dl/search/default.aspx?Term=33858&Field=0&DTC=6