Abstract
The growing volume of electronic text data is a strong motivation for finding efficient ways to explore textual data and, in particular, to retrieve documents. The primary purpose of document retrieval is to identify the documents in a database that are relevant to the user's needs. Document retrieval, which is usually referred to as information retrieval, produces a list of documents related to the user's request. This is done by comparing the user's request with the textual content of the documents in the system. Today, almost all users rely on document retrieval systems, although they may not refer to them by this name and instead use them in the form of, for example, web search engines. Various models have been proposed for document retrieval, but the standard models are based on a vector representation of documents. In the vector space model, documents are represented as vectors in a vector space. Vectors provide a quantitative representation of the meaning of words and make those meanings easy to compare. In this model, documents and queries are treated as bags of words.
In this thesis, after reviewing existing document retrieval methods, a new method based on the vector space model and soft cosine similarity is proposed. The method takes the relations between words and their meanings into account in order to improve retrieval results. It attempts to eliminate some of the problems and drawbacks of the vector space model.
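As an illustration of the soft cosine similarity mentioned above, the following minimal sketch compares a document and a query represented as bag-of-words vectors. It uses the standard definition, in which a term-similarity matrix `s` weights every pair of terms. The vocabulary, the vectors, and the similarity values are hypothetical and chosen only for illustration; this is not the thesis's implementation.

```python
import math

def bilinear(a, s, b):
    """Return the sum over all i, j of a[i] * s[i][j] * b[j]."""
    return sum(a[i] * s[i][j] * b[j]
               for i in range(len(a))
               for j in range(len(b)))

def soft_cosine(a, b, s):
    """Soft cosine similarity of vectors a and b under term-similarity matrix s."""
    denom = math.sqrt(bilinear(a, s, a)) * math.sqrt(bilinear(b, s, b))
    return bilinear(a, s, b) / denom if denom else 0.0

# Hypothetical three-term vocabulary: ["play", "game", "weather"].
doc = [1, 0, 0]    # the document contains only "play"
query = [0, 1, 0]  # the query contains only "game"

# With the identity matrix, soft cosine reduces to ordinary cosine similarity.
identity = [[1, 0, 0],
            [0, 1, 0],
            [0, 0, 1]]

# Assumed similarity values: "play" and "game" are treated as related (0.5).
related = [[1.0, 0.5, 0.0],
           [0.5, 1.0, 0.0],
           [0.0, 0.0, 1.0]]

plain_score = soft_cosine(doc, query, identity)
soft_score = soft_cosine(doc, query, related)
```

Under the identity matrix the two vectors share no terms, so their score is zero. Once the matrix records that the two terms are related, the score becomes positive, which is the kind of word relation that a plain bag-of-words comparison cannot capture.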