سيد مهدي حسني

Title

Extracting Ontological Relations of Persian Texts Using Related Entities

Degree level

Master's

Field of study

Computer

Years of study

1395-1398

Defense date

1398/06/11

Supervisor

Dr. بهروز مينائي

Faculty

Computer

Abstract

Finding similar texts is an important, necessary, and widely used capability for information systems. Given the large volume of text produced in different systems, it is practically impossible for humans to catalog the texts and prepare suitable keywords. Even in systems that have been cataloged, the passage of time and the emergence of important new topics that were not important before make it unavoidable to automate the retrieval of similar texts. Finding similarity on the basis of an understanding of the conceptual structure of a text, in other words its ontological structure, yields more accurate results. In every information domain there are particular entities and identities with whose help texts can be understood conceptually. Most of these entities are related to one another. These relations sometimes appear as synonymy, generality and specificity, affinity, and so on. The number of entities in each domain is usually countable, and the relations between those entities are fixed and well defined within given time intervals. Using these entities, which include events, important and key topics, persons, places, organizations, and so on, therefore helps considerably in discovering related texts. This discovery is based on an ontological understanding of each text. Knowledge bases are a substantial source of ontological relations between entities. In Persian texts, because of specific lexical and grammatical features, finding entities and the relations between them has its own complexity. In this research we discover relations between texts with the help of entities. To do so, the entities and the relations between them are treated as a weighted, directed, non-connected graph. The entities of each text are then treated as a non-connected subgraph. This subgraph is expanded with the help of the overall entity graph. The similarity of the expanded subgraphs of two texts is then computed, and from it a quantity measuring the degree of similarity of the texts is obtained. Finally, the results of the proposed method are compared with human-judged data.

Data entry date

1398/08/18

Title in English

Ontological Relation Extraction from Persian Text by using Related Entities

Availability date

9/2/2019

Student who entered the data

سيدمهدي حسني

Abstract in English

Finding similar texts for information systems is an important, necessary, and useful task. Due to the high volume of text produced in different systems, it is not practicable for humans to catalog the texts and supply suitable keywords. Even in systems that have been cataloged, the passage of time and the emergence of important new topics make automated retrieval of similar texts unavoidable. Finding similarities after understanding the conceptual structure of the text, or in other words its ontological structure, leads to higher accuracy in the results. In each domain, there are certain entities and identities that help us understand the semantics of texts. Often these entities are related to each other. This relation sometimes takes the form of synonymy, generality and specificity, affinity, etc. The number of entities in each domain is usually countable, and the relationships between those entities are fixed and well defined within given time intervals. Therefore, the use of these entities, which include events, important topics, people, places, organizations, and so on, can be helpful in discovering related texts. This discovery is based on the ontological recognition of each text. Knowledge bases are a significant source of ontological relations between entities. In Farsi texts, due to specific lexical and grammatical features, the discovery of entities, as well as of the relationships between them, has its own complexity. In this research, we try to discover the relationship between texts with the help of entities. To this end, entities and their relationships are modeled as a weighted, directed, non-connected graph. The entities of each text are then treated as a non-connected subgraph. This subgraph is expanded with the help of the general entity graph. The similarity of the expanded subgraphs of two texts is then calculated, and based on that, a quantity is obtained to measure the similarity of the texts. Finally, the results of the proposed method are compared with human judgment data.
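The pipeline outlined in the abstract (global entity graph, per-text subgraph, expansion, similarity score) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: the toy graph and its weights, the one-hop expansion rule that multiplies edge weights, and the use of cosine similarity as the comparison measure. The names ENTITY_GRAPH, expand, and similarity are hypothetical.

```python
from math import sqrt

# Hypothetical toy entity graph: directed, weighted, not necessarily connected.
# ENTITY_GRAPH[a][b] is the strength of the relation a -> b (made-up values).
ENTITY_GRAPH = {
    "Tehran": {"Iran": 0.9},
    "Isfahan": {"Iran": 0.9},
    "Iran": {"Asia": 0.6},
    "football": {"sport": 0.8},
}


def expand(entities, graph, hops=1):
    """Grow a text's entity set along outgoing edges of the global graph.

    Returns {entity: score}. Entities found in the text score 1.0; each
    reached neighbour keeps the best product of edge weights along a path.
    (Assumed expansion rule, chosen only for this sketch.)
    """
    scores = {e: 1.0 for e in entities}
    frontier = dict(scores)
    for _ in range(hops):
        next_frontier = {}
        for node, score in frontier.items():
            for neighbour, weight in graph.get(node, {}).items():
                candidate = score * weight
                if candidate > scores.get(neighbour, 0.0):
                    scores[neighbour] = candidate
                    next_frontier[neighbour] = candidate
        frontier = next_frontier
    return scores


def similarity(a, b):
    """Cosine similarity between two entity-score maps (assumed measure)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Entities assumed to have been detected in three short texts.
text_a = {"Tehran"}
text_b = {"Isfahan"}
text_c = {"football"}

sim_ab = similarity(expand(text_a, ENTITY_GRAPH), expand(text_b, ENTITY_GRAPH))
sim_ac = similarity(expand(text_a, ENTITY_GRAPH), expand(text_c, ENTITY_GRAPH))
```

In this toy setting the two city texts share no entity directly, so their unexpanded similarity is zero; after expansion both reach "Iran" and sim_ab becomes positive, while sim_ac stays at zero. This is the effect the expansion step is meant to capture, under the assumptions stated above.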

Link to this record

https://dl.iust.ac.ir/dl/search/default.aspx?Term=21296&Field=0&DTC=6