Abstract
Keyword search, as an alternative to structured query languages, provides a simple and user-friendly interface for searching and retrieving information from graph-structured databases. In contrast to classical database retrieval methods, keyword search keeps the user abstracted from the database structure. Keyword queries are expressed as a set of keywords, and their answers take the form of a set of connected structures that show the relationships between the queried keywords in the database. The simplicity of this form of querying shifts the complexity of working with graph data from the querying stage to the query processing stage. Therefore, answering keyword queries requires sophisticated processing of textual and structural data. One of the major challenges in keyword query processing is retrieving the set of answers relevant to a query, which generally takes a long time because the set is large. In this thesis, several methods are developed to retrieve query answers with an emphasis on approximately preserving the order of their final ranking. Using an approximate estimate of the weight of incomplete answers, these methods attempt to retrieve better answers before the others. Enumerating answers in approximate order makes it possible to return a set of top-k answers before the entire answer set has been retrieved. These methods also increase the efficiency of the system by limiting the search space using indexing, partitioning, and pruning techniques. The second major challenge in keyword search is determining the degree of relevance of an answer, which takes the form of a subgraph, to a textual query. This degree of relevance depends on the textual content of the answer and its structural compactness. The challenge is rarely discussed in the literature, even though the effectiveness of a keyword search system depends entirely on the order in which answers are presented.
In this thesis, the degree of relevance of answers to the query is determined by modeling answers and queries and calculating the similarity of these models. In the answer model, the structural characteristics of the answer and the weight of the queried keywords in each node, down to the attribute level, are aggregated into a single model. This model is built directly on the subgraphs and is able to maintain the local importance of the keywords. The query is also modeled in two ways, a simple one and an extended one. The simple query model is estimated from the keywords entered by the user, while the extended model uses feedback information to extend the query and provide a more accurate estimate of what the user is looking for. The systems proposed in this thesis are designed within a general framework comprising data modeling, indexing of the graph data, retrieval of relevant answers, and ranking of the answer list. The results of the experimental evaluation of these systems on three real-world datasets confirm their efficiency and effectiveness compared to state-of-the-art systems in the field of keyword search.