Abstract
Data volumes and processing complexity keep growing, creating a need for new storage and processing infrastructures. Traditional relational databases are no longer sufficient for such workloads, which makes NoSQL databases necessary. At the same time, the sheer volume of data and processing has given rise to the concept of big data. The data studied in this thesis is very large and distributed across multiple servers, which is why a query-based NoSQL database was chosen. In this type of database, tables are designed around requirements and queries, so that each query is answered from a single table and no joins between tables are needed. To speed up queries further, related queries are grouped into one cluster using an appropriate clustering algorithm. Queries are clustered according to similarity metrics such as the entities they involve and the number of fields they share.
The purpose of this thesis is to propose a method for clustering queries in a way that optimizes the database design.
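
As a rough illustration of clustering queries by shared fields, the sketch below reduces each query to the set of fields it touches and groups queries whose sets overlap enough. The abstract does not name a specific similarity measure or clustering algorithm, so the Jaccard index, the greedy single-pass grouping, the 0.5 threshold, and the sample query names are all assumptions made for this example, not the method of the thesis.

```python
from typing import Dict, FrozenSet, List

# Hypothetical representation: each query is reduced to the set of
# "entity.field" strings it reads. This schema is assumed for illustration.
Query = FrozenSet[str]


def jaccard(a: Query, b: Query) -> float:
    """Fraction of fields two queries share (0.0 = disjoint, 1.0 = identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_queries(queries: Dict[str, Query], threshold: float = 0.5) -> List[List[str]]:
    """Greedy single-pass clustering (an assumed stand-in algorithm).

    A query joins the first cluster whose first member is at least
    `threshold` similar to it; otherwise it starts a new cluster.
    """
    clusters: List[List[str]] = []
    for name, fields in queries.items():
        for cluster in clusters:
            representative = queries[cluster[0]]
            if jaccard(fields, representative) >= threshold:
                cluster.append(name)
                break
        else:
            # No existing cluster was similar enough.
            clusters.append([name])
    return clusters


# Made-up sample queries, keyed by a descriptive name.
queries = {
    "orders_by_customer": frozenset({"order.id", "order.customer_id", "order.total"}),
    "order_totals": frozenset({"order.id", "order.total"}),
    "product_stock": frozenset({"product.id", "product.stock"}),
}
clusters = cluster_queries(queries)
print(clusters)
```

Under these assumptions the two order queries share two of three distinct fields and fall into one cluster, while the product query forms its own. In a query-based design, each resulting cluster would be a candidate for tables stored together.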