Abstract (English)
Classification tasks assign data to categories. Graph classification can be either unsupervised or supervised. Unsupervised methods group graphs into a fixed number of categories by similarity (distance). In supervised classification, a model is constructed by learning from training data in which each graph has a target value or class label (e.g., biochemical activity). Supervised methods are more fundamental from a technical point of view, because unsupervised learning problems can be solved by supervised methods via probabilistic modeling of latent class labels. Supervised methods for graph classification fall into two groups: graph kernels, which are similarity-based, and graph similarity methods, which are feature-based. Kernel methods construct a prediction rule from a similarity function between two objects. Graph similarity methods use subgraph pattern mining for graph classification; this approach has lower complexity and a shorter run time. Research has shown that graphs of the same class may not be similar overall but can still share the same discriminative subgraphs. The GBoost algorithm classifies graphs using subgraph mining, and several extensions of GBoost for graph data have been developed in recent years. In this paper we focus on the accuracy of GBoost graph classification. We use the GAIA algorithm to extract subgraph patterns. In the GBoost algorithm, graph mining is performed by the gSpan algorithm, which uses a frequency measure for subgraph mining. Research has shown that selecting subgraph patterns by frequency alone is not effective: many negative subgraph patterns cannot be mined for graph classification, and only positive subgraph patterns are used. This motivates using a discriminative measure alongside the frequency measure. The GAIA algorithm uses both frequency and discriminative measures.
Although using two measures cannot guarantee that all discriminative patterns are found, it yields more discriminative patterns than using the frequency measure alone. We propose three methods for improving the GBoost algorithm. All three extract subgraph patterns in the same way but differ in how the patterns are selected. In the first proposed algorithm, we select k subgraph patterns for graph classification, where k is the number of positive graphs. In the second, we use all subgraph patterns for graph classification, so that no valuable subgraph pattern is lost. In the third, we use a new criterion to select k subgraph patterns for graph classification. We evaluate the methods on six chemical datasets. The experimental results show that all three algorithms are more accurate than GBoost and have run times comparable to it.
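
The pattern-selection step described above can be illustrated with a minimal Python sketch. It is not the implementation evaluated in this paper. The `Pattern` record, the `discriminative_score` function (an absolute difference of class-wise relative frequencies), and the ranking used for the first strategy are all assumptions made for illustration. The abstract does not define the third strategy's criterion, so it appears only as a caller-supplied scoring function.

```python
from typing import Callable, List, NamedTuple


class Pattern(NamedTuple):
    """Hypothetical record for one mined subgraph pattern."""
    name: str
    pos_support: int  # number of positive graphs containing the pattern
    neg_support: int  # number of negative graphs containing the pattern


def discriminative_score(p: Pattern, n_pos: int, n_neg: int) -> float:
    # Assumed measure: absolute difference of relative frequencies, so that
    # patterns typical of the negative class also score highly.
    return abs(p.pos_support / n_pos - p.neg_support / n_neg)


def select_top_k(patterns: List[Pattern], k: int,
                 score: Callable[[Pattern], float]) -> List[Pattern]:
    # Strategies 1 and 3: keep the k best patterns under a given criterion.
    return sorted(patterns, key=score, reverse=True)[:k]


def select_all(patterns: List[Pattern]) -> List[Pattern]:
    # Strategy 2: keep every mined pattern so that none is lost.
    return list(patterns)


# Toy data (invented for illustration): 2 positive and 4 negative graphs.
n_pos, n_neg = 2, 4
patterns = [
    Pattern("A", 2, 0),
    Pattern("B", 2, 4),
    Pattern("C", 0, 3),
    Pattern("D", 1, 0),
]

# Strategy 1: k equals the number of positive graphs.
first = select_top_k(
    patterns, k=n_pos,
    score=lambda p: discriminative_score(p, n_pos, n_neg),
)

# Strategy 2: all patterns.
second = select_all(patterns)
```

For the third strategy, `select_top_k` would be called with a different `score` function. Pattern "B" occurs in every graph, so it is the most frequent pattern yet scores zero under the assumed measure, which is the weakness of frequency-only selection noted above.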