Abstract (English)
Finding similar texts is an important, necessary, and useful task for information systems. Because of the high volume of text produced by different systems, it is not practical for humans to catalog texts and supply suitable keywords. Finding similarities after understanding the conceptual structure of a text, in other words its ontological structure, leads to more accurate results. In each domain there are certain entities and identities that can help us understand the semantics of texts. These entities are often related to each other, and the relation may take the form of synonymy, specificity, connectivity, and so on. The number of entities in each domain is usually finite, and the relationships between those entities hold within specified time intervals. Therefore, the use of these entities, which include events, keywords, people, places, organizations, and so on, can be helpful in discovering related texts. This discovery is based on the ontological recognition of each text. Knowledge bases are a significant source of ontological connections between entities. In Farsi texts, specific lexical and grammatical features make the discovery of entities, and of the relationships between them, more complex. In this research, we try to discover the relationship between texts with the help of entities. To this end, entities and their relationships are modeled as a volatile, unidirectional graph. The entities of each text are then treated as a non-connected subgraph of this graph. This subgraph is expanded with the help of the general entity graph. Finally, the similarity of the subgraphs of two texts is calculated, and from it a quantity measuring the similarity of the texts is obtained.
The results of this proposed method are compared with human judgment data.
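
The pipeline outlined above (entity graph, per-text subgraph, expansion, subgraph comparison) can be illustrated with a minimal sketch. It is not the implementation used in this research. The function names `build_graph`, `expand`, and `similarity`, the `hops` parameter, the toy entities, and the use of Jaccard overlap as the similarity quantity are all assumptions made for illustration, since the abstract does not specify them. For simplicity the sketch also treats every relation as symmetric and ignores changes over time.

```python
from collections import deque


def build_graph(edges):
    """Build an adjacency map from (entity, entity) pairs.

    Assumption: relations are stored in both directions for simplicity.
    """
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def expand(graph, seeds, hops=1):
    """Return the seed entities plus every entity within `hops` edges of a seed."""
    seen = set(seeds)
    frontier = deque((s, 0) for s in seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue  # do not walk further than the allowed radius
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return seen


def similarity(graph, entities_a, entities_b, hops=1):
    """Jaccard overlap of the two expanded entity sets (an assumed measure)."""
    a = expand(graph, entities_a, hops)
    b = expand(graph, entities_b, hops)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical toy knowledge graph and per-text entity sets.
general_graph = build_graph([
    ("Tehran", "Iran"),
    ("Isfahan", "Iran"),
    ("Paris", "France"),
])
score_related = similarity(general_graph, {"Tehran"}, {"Isfahan"})
score_unrelated = similarity(general_graph, {"Tehran"}, {"Paris"})
score_no_expansion = similarity(general_graph, {"Tehran"}, {"Isfahan"}, hops=0)
```

In this toy example the two texts share no entity directly, so without expansion (`hops=0`) their overlap is zero. After a one-hop expansion both subgraphs contain the shared neighbor `Iran`, which is the effect the expansion step is meant to capture.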