Voting based extreme learning machine
Introduction
During the past several decades, many methods have been developed for classification. The most representative approaches are traditional Bayesian decision theory [8], [21], the support vector machine (SVM) and its variants [12], [25], the artificial neural network (ANN) and its variants [9], [10], [11], [24], [27], [33], fuzzy methods and their variants [5], [13], [20], [23], [36], etc. Among these methods, the ANN offers several distinctive characteristics. First, by incorporating learning algorithms that change the network structure and parameters based on external or internal information flowing through the network, a neural network can adaptively fit the data without any explicit specification of the underlying model. Second, neural networks have the universal approximation property [10], [11]: they can approximate any function to arbitrary accuracy. Since the general procedure in classification applications is to construct a functional relationship between several given attributes of an object and its class label, the universal approximation property makes the neural network an efficient classification tool. Finally, the adjustable network structure and nonlinear basic computing neurons make neural networks flexible in modeling the complex functional relationships of real world applications.
Recently, a least-squares based learning algorithm named the extreme learning machine (ELM) [14] was developed for single hidden layer feedforward networks (SLFNs). Using random computational nodes that are independent of the training samples, ELM has several promising features. It is a tuning-free algorithm and learns much faster than traditional gradient-based approaches such as the Back-Propagation (BP) algorithm [9] and the Levenberg–Marquardt algorithm [24], [27]. Moreover, ELM tends to reach a small norm of network output weights; Bartlett's theory [1] states that for feedforward neural networks reaching small training error, the smaller the norm of the weights, the better the generalization performance the network tends to have. It has been further shown [15], [16], [17], [18] that many types of computational hidden nodes which may not be neuron-like can be used in ELM as long as they are piecewise nonlinear, such as radial basis function (RBF) hidden nodes [16], fully complex nodes [18], wavelets [3], [4], etc. A number of real world applications based on ELM have been reported in recent years [19], [34], [35].
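The core ELM training procedure described above can be sketched as follows. This is a minimal illustration, not the paper's code: the function names are ours, and the sigmoid activation is one common choice of piecewise-nonlinear hidden node.

```python
import numpy as np

def elm_train(X, T, n_hidden, rng):
    """Train an SLFN with ELM: hidden-layer parameters are drawn at random
    and never tuned; only the output weights are solved by least squares."""
    n_features = X.shape[1]
    W = rng.standard_normal((n_features, n_hidden))  # random input weights
    b = rng.standard_normal(n_hidden)                # random hidden biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))           # sigmoid hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T                     # output weights via Moore-Penrose pseudoinverse
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Forward pass of the trained SLFN."""
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta
```

Because the only trained quantity is the closed-form least-squares solution `beta`, there is no iterative tuning, which is the source of ELM's speed advantage over gradient-based methods such as BP.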
ELM performs classification by mapping the class label to a high dimensional vector and transforming the classification task into a multi-output function regression problem. An issue with ELM is that, since the hidden node learning parameters are randomly assigned and remain unchanged during the training procedure, the classification boundary may not be optimal. Some samples may be misclassified by ELM, especially those near the classification boundary.
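The label-to-vector mapping can be illustrated with a one-hot encoding, which is one common way to realize this multi-output regression formulation (the exact encoding is an assumption here, not prescribed by the text above); the predicted class is then decoded as the index of the largest network output.

```python
import numpy as np

# Encode: 4 samples, 3 classes -> each label becomes a 3-dimensional target row
labels = np.array([0, 2, 1, 2])
T = np.eye(3)[labels]                     # one-hot regression targets

# Decode: raw multi-output regression values for 2 hypothetical test samples
outputs = np.array([[0.9, 0.2, -0.1],
                    [0.1, 0.3,  0.8]])
pred = outputs.argmax(axis=1)             # index of largest output = class label
```

A sample whose output vector has two nearly equal components lies close to the classification boundary; these are exactly the samples a single ELM realization is most likely to misclassify.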
To reduce the number of such misclassified samples, we propose in this paper an improved algorithm called the voting based extreme learning machine (V-ELM for short). The main idea of V-ELM is to perform multiple independent ELM trainings instead of a single one, and then make the final decision by majority voting [29]. Compared with the original ELM algorithm, the proposed V-ELM not only enhances classification performance and reduces the number of misclassified samples, but also lowers the variance among different realizations. Simulations on many real world classification datasets demonstrate that V-ELM generally outperforms several recent methods, including the original ELM [14], the support vector machine (SVM) [12], the optimally pruned extreme learning machine (OP-ELM) [28], the Back-Propagation (BP) algorithm [9], [24], [27], the K nearest neighbors (KNN) algorithm [2], [7], the robust fuzzy relational classifier (RFRC) [5], the radial basis function neural network (RBFNN) [33] and the multiobjective simultaneous learning framework (MSCC) [6].
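The voting idea can be sketched in a self-contained way as follows, assuming one-hot targets and sigmoid hidden nodes; the function and parameter names (`velm`, `n_hidden`, `K`) are illustrative, not the paper's notation.

```python
import numpy as np

def _sigmoid(Z):
    return 1.0 / (1.0 + np.exp(-Z))

def velm(X_tr, y_tr, X_te, n_hidden=20, K=7, seed=0):
    """Voting based ELM sketch: train K independently initialized ELMs on
    the same data and decide each test sample by majority vote."""
    rng = np.random.default_rng(seed)
    n_classes = int(y_tr.max()) + 1
    T = np.eye(n_classes)[y_tr]                      # one-hot regression targets
    votes = np.zeros((X_te.shape[0], n_classes), dtype=int)
    for _ in range(K):
        # One independent ELM realization: random hidden layer,
        # least-squares output weights via the pseudoinverse.
        W = rng.standard_normal((X_tr.shape[1], n_hidden))
        b = rng.standard_normal(n_hidden)
        beta = np.linalg.pinv(_sigmoid(X_tr @ W + b)) @ T
        preds = (_sigmoid(X_te @ W + b) @ beta).argmax(axis=1)
        votes[np.arange(len(preds)), preds] += 1     # tally this realization's vote
    return votes.argmax(axis=1)                      # majority class per test sample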
The organization of this paper is as follows. In Section 2, we first briefly review the basic concept of ELM. Then, we analyze an issue with ELM in classification applications, and present the V-ELM. Simulation results and comparisons are provided in Section 3. In Section 4, discussions on the performance of V-ELM with respect to different independent training numbers and on three recent methods [22], [26], [32] which also exploit multiple classifiers in ELM are given. Conclusions are drawn in Section 5. An appendix is given at the end to illustrate the three propositions presented in Section 2.
Section snippets
Voting based extreme learning machine
In this section, we first review the basic concept of the ELM algorithm for SLFNs in Section 2.1. Then, we analyze an issue that may exist in ELM when performing classification applications in Section 2.2. Finally, the newly proposed V-ELM algorithm is presented in Section 2.3 to tackle this issue and enhance the classification performance of ELM.
Performance verification
In this section, the performance of the proposed V-ELM is compared with the original ELM [14], SVM [12] and several other recent classification methods, including OP-ELM [28], the Back-Propagation (BP) algorithm [9], [24], [27], the K nearest neighbors (KNN) algorithm [2], [7], RFRC [5], RBFNN [33] and MSCC [6]. Simulations are conducted on many real world classification datasets. All experiments on ELM, V-ELM, SVM and OP-ELM are carried out in the Matlab 7.4 environment running on an ordinary PC with 2.66
Discussions
In this section, we first discuss the performance of V-ELM with respect to different independent training numbers K. Then, we briefly discuss three recent methods [22], [26], [32] which also exploit multiple classifiers in ELM.
It was stated in Proposition 2.1 that for the V-ELM to work properly, a sufficiently large independent training number K is required. However, it may not be practically feasible to use V-ELM if the required K is too large, say well above 100 as the computational time of
Conclusions
In this paper, we have proposed an improved classification algorithm named the voting based extreme learning machine to train SLFNs. Based on the simulation results, comparisons and discussions, we draw the following conclusions:
- (1)
Compared with the original ELM algorithm, the incorporation of the voting method enables V-ELM to achieve a much higher classification success rate in general. However, this performance improvement comes at the price of increasing the training time by about K folds of
Acknowledgments
The authors thank the anonymous reviewers whose insightful and helpful comments greatly improved this paper.
References (36)
- Composite function wavelet neural networks with extreme learning machine, Neurocomputing (2010)
- Robust fuzzy relational classifier incorporating the soft class labels, Pattern Recognition Letters (2007)
- Approximation capabilities of multilayer feedforward networks, Neural Networks (1991)
- Multilayer feedforward networks are universal approximators, Neural Networks (1989)
- Finding useful fuzzy concepts for pattern classification using genetic algorithm, Information Sciences (2005)
- Extreme learning machine: theory and applications, Neurocomputing (2006)
- Incremental extreme learning machine with fully complex hidden nodes, Neurocomputing (2008)
- Convex incremental extreme learning machine, Neurocomputing (2007)
- Enhanced random search based incremental extreme learning machine, Neurocomputing (2008)
- Support vector machines with genetic fuzzy feature transformation for biomedical data classification, Information Sciences (2007)
- Ensemble of online sequential extreme learning machine, Neurocomputing
- A self-constructing fuzzy CMAC model and its applications, Information Sciences
- Rough set based 1-v-1 and 1-v-r approaches to support vector machine multi-classification, Information Sciences
- The sample complexity of pattern classification with neural networks: the size of the weights is more important than the size of the network, IEEE Transactions on Information Theory
- Output-sensitive algorithms for computing nearest-neighbor decision boundaries, Discrete and Computational Geometry
- Composite function wavelet neural networks with differential evolution and extreme learning machine, Neural Processing Letters
- A multiobjective simultaneous learning framework for clustering and classification, IEEE Transactions on Neural Networks
- Nearest neighbor pattern classification, IEEE Transactions on Information Theory