Abstract
With the rapid growth of social networks and their role in data exchange, we now face a huge volume of data. Network data are often uncertain, for example because users do not provide complete information, so the labels of many users are unknown. Since identifying these labels makes it possible to find specific and valuable patterns in applications such as marketing and business, data mining methods, including classification, are needed to determine them. An important factor in designing an automated classifier is the dataset used in its training phase. The available training data consist of labeled and unlabeled samples. Collecting labels is usually hard, costly, and time-consuming, whereas collecting unlabeled data is relatively easy and inexpensive. Traditional classification methods use only labeled data for training, but exploiting the distribution of unlabeled samples together with the labeled ones improves classification accuracy. The proposed mechanism therefore uses a graph-based semi-supervised method for classification. One of the most important graph-based semi-supervised algorithms is label propagation. Label propagation is applied to a similarity graph in which the similarity of two nodes is computed from profile attributes only. In the proposed method, both structural and profile properties are used to construct the similarity graph. Another issue is heterogeneity in class labels: in the real world, connections exist both between individuals with similar labels and between individuals with dissimilar labels, whereas previous methods consider only the similar type. Ignoring this distinction reduces classification accuracy.
To address this challenge, the proposed mechanism uses a matrix, called the dependency matrix, to determine the effect of a propagated label according to the type of relationship involved. This matrix is computed from both types of relationships.
Keywords: semi-supervised classification, graph-based, structural characteristics, label propagation
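
As a rough illustration of the idea summarized in the abstract, the following minimal sketch builds a similarity graph from both profile and structural information and then runs iterative label propagation on it. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the function names, the use of cosine similarity for profiles and Jaccard overlap of closed neighborhoods for structure, the mixing weight `alpha`, and the class-compatibility matrix `H`. In particular, `H` is only a hypothetical stand-in for the dependency matrix, whose actual computation is not described in the abstract.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length numeric profile vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def jaccard(a, b):
    """Jaccard overlap of two neighbor sets (assumed structural similarity)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def build_similarity(profiles, neighbors, alpha=0.5):
    """W[i][j] = alpha * profile similarity + (1 - alpha) * structural similarity."""
    n = len(profiles)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                W[i][j] = (alpha * cosine(profiles[i], profiles[j])
                           + (1 - alpha) * jaccard(neighbors[i], neighbors[j]))
    return W

def propagate(W, seeds, n_classes, H=None, iters=100):
    """Iterative label propagation with labeled nodes clamped.

    seeds maps node index -> known class. H[a][b] is an assumed compatibility
    weight: how strongly a neighbor of class a supports class b. The default
    identity matrix gives plain propagation between similar labels only.
    """
    n = len(W)
    if H is None:
        H = [[1.0 if a == b else 0.0 for b in range(n_classes)]
             for a in range(n_classes)]
    # Start unlabeled nodes from a uniform distribution over classes.
    F = [[1.0 / n_classes] * n_classes for _ in range(n)]
    for i, c in seeds.items():
        F[i] = [1.0 if k == c else 0.0 for k in range(n_classes)]
    for _ in range(iters):
        new_F = []
        for i in range(n):
            if i in seeds:  # clamp known labels
                new_F.append(F[i])
                continue
            scores = [0.0] * n_classes
            for j in range(n):
                for a in range(n_classes):
                    for b in range(n_classes):
                        scores[b] += W[i][j] * F[j][a] * H[a][b]
            total = sum(scores)
            new_F.append([s / total for s in scores] if total else F[i])
        F = new_F
    return [max(range(n_classes), key=lambda k: F[i][k]) for i in range(n)]

# Toy data: four users, two profile features, closed neighborhoods (each set
# includes the node itself, an assumption so that adjacent nodes overlap).
profiles = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
neighbors = [{0, 1}, {0, 1}, {2, 3}, {2, 3}]
seeds = {0: 0, 2: 1}  # only users 0 and 2 are labeled

W = build_similarity(profiles, neighbors)
labels = propagate(W, seeds, n_classes=2)
print(labels)  # [0, 0, 1, 1]
```

Passing a non-identity `H`, for example `[[0.0, 1.0], [1.0, 0.0]]`, makes a neighbor's label count as evidence for the opposite class, which is one simple way to model connections between individuals with dissimilar labels.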