Abstract in English
Abstract:
In this thesis, we have investigated the instability issue in decision tree learning algorithms and its causes. We have proposed a general abstract decision tree learning algorithm that induces more stable decision trees than traditional algorithms. In addition, we have proposed concrete algorithms: two batch algorithms that learn decision trees from static data and one incremental algorithm that learns decision trees from streaming data. As there was no definition of stability in the stream context, in this thesis we have clarified the problem space by resolving the ambiguities in defining stability for incremental decision tree learning algorithms. Although several references have reported strong instability in batch decision tree learning algorithms, this issue had not been investigated for incremental learning algorithms in the stream context. In this thesis, we have demonstrated, both theoretically and experimentally, the presence of the instability issue in incremental decision tree learning algorithms. Our focus has been improving structural stability, i.e., minimizing the sensitivity of the decision tree structure to the training instances. The key solution of this thesis for improving the structural stability of decision trees is to use non-monolithic split tests based on multiple attributes, designed to eliminate the competition between attributes with close merits, to localize the effect of the training instances on the split test, and to make the split test trainable. In the proposed algorithms, fuzzy min-max neural networks are employed as the split tests in a way that provides these desired properties. In the stream context, the proposed algorithm (which also contributes to adapting fuzzy min-max neural networks to concept drift) not only achieves a good balance between stability and flexibility, but also offers advantages such as efficient adaptation to concept drift and to newly emerging classes. Theoretical analysis and experimental evidence show that the proposed algorithms not only provide higher structural stability than existing decision tree learning algorithms, but also produce smaller and shallower models, because they split the feature space non-linearly at the internal nodes, while offering comparable accuracy and efficiency.
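To give a concrete flavor of the kind of multi-attribute split test referred to above, the following is a minimal sketch of the fuzzy membership of a point in a hyperbox, in the spirit of Simpson's classic fuzzy min-max formulation. This is an illustrative sketch only, not the thesis's actual algorithm; the function name, the simple ramp function, and the parameter `gamma` are assumptions made for illustration.

```python
import numpy as np

def hyperbox_membership(x, v, w, gamma=1.0):
    """Fuzzy membership of point x in the hyperbox with min point v
    and max point w (illustrative, after Simpson-style fuzzy min-max).

    gamma controls how quickly membership decays outside the box.
    Returns 1.0 for points inside the box, decreasing toward 0 outside.
    """
    x, v, w = map(np.asarray, (x, v, w))

    # Ramp function: 0 when the point does not violate the bound,
    # growing linearly with the violation, capped at 1.
    def f(d):
        return np.clip(d * gamma, 0.0, 1.0)

    # Per-attribute membership, penalizing violations of either bound,
    # averaged over all attributes.
    per_dim = 1.0 - f(x - w) - f(v - x)
    return float(per_dim.mean())
```

In a split test of this style, an internal node would hold several labeled hyperboxes and route an instance toward the branch whose hyperbox gives the highest membership, so the test depends on multiple attributes jointly rather than on a single winning attribute.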
Keywords: Decision tree, Instability, Data stream, Fuzzy min-max neural network, Concept drift, Classification.