Comparison of crisp and fuzzy classification trees using gini index impurity measure on simulated data
Date
2015-02-09
Authors
Muchai, Eunice
Odongo, Leo
Publisher
European Scientific Institute
Abstract
Crisp classification trees have been used to model many situations,
such as disease classification. With the introduction of fuzzy theory, fuzzy
classification trees are gaining popularity, especially in data mining. Very
little work has been done on comparing crisp and fuzzy classification trees.
This paper compares crisp classification trees and fuzzy classification trees
using the Gini index as the impurity measure. The objective is to determine
which of the two types of tree gives fewer classification errors. The
data used consisted of two sets of observations from multivariate normal
distributions. The first data set came from two 3-variate normal
populations with different mean vectors and a common dispersion matrix.
From each of the two populations, 5000 observations were generated. Of
these, 1000 observations from each population were used to build the trees,
and the remaining 4000 from each population were used to test them. The
second data set came from three 4-variate normal populations with different
mean vectors and a common dispersion matrix. A sampling and testing
procedure similar to that used for the first data set was employed.
Computations were implemented in the R statistical package. The test results
showed that fuzzy classification trees allocated observations to the correct
population with fewer errors than crisp classification trees did.
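As a rough illustration of the impurity measure named in the abstract, the sketch below computes the Gini index, 1 - sum(p_k^2), for a single split under a crisp decision point and under a fuzzy one. It is not the authors' implementation: it is written in Python rather than R, and the linear-ramp membership function, the `width` parameter, and all function names are assumptions made for this example. The paper's fuzzy decision points may be defined differently.

```python
from collections import defaultdict


def gini(class_weights):
    """Gini impurity 1 - sum(p_k^2) from (possibly fractional) class weights."""
    total = sum(class_weights.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((w / total) ** 2 for w in class_weights.values())


def crisp_left_degree(x, threshold):
    """Crisp decision point: an observation goes wholly left or wholly right."""
    return 1.0 if x <= threshold else 0.0


def fuzzy_left_degree(x, threshold, width):
    """Assumed linear ramp: degree of membership in the left branch.

    Membership falls from 1 to 0 across [threshold - width, threshold + width].
    This shape is hypothetical, chosen only for illustration.
    """
    lo, hi = threshold - width, threshold + width
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)


def split_impurity(xs, labels, left_degree):
    """Weighted Gini impurity of the two branches produced by a split.

    Each observation contributes left_degree(x) to the left branch and
    1 - left_degree(x) to the right branch.
    """
    left, right = defaultdict(float), defaultdict(float)
    for x, y in zip(xs, labels):
        d = left_degree(x)
        left[y] += d
        right[y] += 1.0 - d
    n_left, n_right = sum(left.values()), sum(right.values())
    n = n_left + n_right
    return (n_left / n) * gini(left) + (n_right / n) * gini(right)


# Toy data: one variable, two populations labelled "A" and "B".
xs = [1.0, 2.0, 3.0, 4.0]
labels = ["A", "A", "B", "B"]

crisp = split_impurity(xs, labels, lambda x: crisp_left_degree(x, 2.5))
fuzzy = split_impurity(xs, labels, lambda x: fuzzy_left_degree(x, 2.5, 1.0))
print(crisp, fuzzy)
```

The crisp rule sends every observation entirely to one branch, whereas the fuzzy rule shares observations near the decision point between both branches, so branch class proportions are computed from fractional weights. The toy values only show the mechanics of the calculation and say nothing about which tree classifies better.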
Keywords
Crisp classification tree, Fuzzy classification tree, Gini index, Fuzzy decision points, Crisp decision points
Citation
European Scientific Journal, June 2014 edition, vol. 10, no. 18