A comprehensive evaluation of random vector functional link networks
Introduction
Single layer feedforward neural networks (SLFN) have been widely applied to solve problems such as classification and regression because of their universal approximation capability [14], [17], [20], [31]. Conventional methods for training SLFN are back-propagation based learning algorithms [7], [10]. These iterative methods suffer from slow convergence, entrapment in local minima, and sensitivity to the learning rate setting. The Random Vector Functional Link network (RVFL), shown in Fig. 1, a randomized version of the functional link neural network [8], [25], demonstrates that the weights from the input layer to the hidden layer can be randomly generated in a suitable domain and kept fixed during the learning stage. The method independently developed in [35] also belongs to the family of randomized methods for training artificial neural networks with randomized input layer weights. However, the method in [35] does not have direct links between the inputs and the outputs, whereas RVFL has highly beneficial direct links.
RVFL was proposed in [28]. Learning and generalization characteristics of RVFL were discussed in [26]. In [17], Igelnik and Pao proved that the RVFL network is a universal approximator for a continuous function on a bounded finite dimensional set, with a closed-form solution. Since then, RVFL has been employed to solve problems in diverse domains. A dynamic step-wise updating algorithm was proposed in [5] to update the output weights of the RVFL on-the-fly, both for a newly added pattern and for a newly added enhancement node. The RVFL network was investigated in [37] in the context of modeling and control. The authors of [37] suggested combining unsupervised placement of network nodes according to the input data density with subsequent supervised or reinforcement learning of the linear parameters of the approximator. Modeling conditional probabilities with RVFL was reported in [15].
RVFL can also be combined with other learning methods. In [6], RVFL was combined with statistical hypothesis testing and self-organization of a number of enhancement nodes to generate a new learning system called the statistical self-organizing learning system (SSOLS) for remote sensing applications. In [16], expectation maximization was combined with RVFL to improve its performance. RVFL has also been investigated in the ensemble learning framework. In [1], a decorrelated RVFL ensemble was introduced based on negative correlation learning. An RVFL based multi-source data ensemble for clinker free lime content estimation in rotary kiln sintering processes can be found in [21]. RVFL has also been widely applied to solve real-life problems. In [30], the authors reported the performance of a holistic-styled word-based approach to off-line recognition of English language script, in which a radial basis function neural net and RVFL were combined. Their approach, named the density-based random-vector functional-link net (DBRVFLN), was helpful in improving the performance of word recognition. In [29], RVFL was used in an MPEG-4 coder. In [38], RVFL was applied to pedestrian detection based on a combination of multiple features. In [39], RVFL was combined with AdaBoost in a pedestrian detection system. In [23], the authors investigated the performance of hardware implementation methods for RVFL. In [34], distributed learning of RVFL was proposed, where the training data are distributed under a decentralized information structure.
Consider an RVFL as demonstrated in Fig. 1. As mentioned before, the weights aij from the input to the enhancement nodes are randomly generated such that the activation functions are not all saturated. Following the approach in [1], all the weights in this work are generated from a uniform distribution within [−S, S], where S is a scale factor to be determined during the parameter tuning stage for each dataset. For RVFL, only the output weights β need to be determined, by solving the following problem:

min_β Σ_{i=1}^{P} ‖βᵀd_i − t_i‖² (1)

where P is the number of data samples, t_i is the target and d_i is the vector concatenation of the original features as well as the random features.¹ Directly solving the problem in Eq. (1) may lead to over-fitting. In practice, a regularization on the solution, such as regularized least squares or a preference for the solution with the smaller norm [3], can be adopted. RVFL variants can be roughly divided into two classes based on the algorithm used to obtain the output weights. One is iterative RVFL, which obtains the output weights in an iterative manner based on the gradient of the error function. The other is closed-form based RVFL, which obtains the output weights in a single step. The present work focuses on the closed-form based RVFL because of its efficiency. A straightforward solution within a single learning step can be achieved by a pseudo-inverse [17], [27], among which the Moore–Penrose pseudoinverse, β = D⁺T, where D and T are the matrix versions of the features and targets obtained by stacking the features and targets of all data samples, is most commonly used. Another alternative is the L2 norm regularized least squares (or ridge regression), which solves the following problem:

min_β ‖Dβ − T‖² + λ‖β‖²

The solution is given by β = (DᵀD + λI)⁻¹DᵀT, where λ is the regularization parameter to be tuned.
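The closed-form training described above can be sketched compactly in code. The sketch below is illustrative, not the authors' implementation: it assumes a sigmoid activation, draws the input weights and biases uniformly from [−S, S], concatenates the original and random features (the direct links), and solves the ridge regression normal equations for β. All function names and default parameters are hypothetical.

```python
import numpy as np

def rvfl_train(X, T, n_enhancement=100, S=1.0, lam=0.1, seed=0):
    """Closed-form RVFL sketch: random input weights fixed after sampling,
    output weights beta solved by ridge regression with direct links."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.uniform(-S, S, size=(d, n_enhancement))  # random input weights
    b = rng.uniform(-S, S, size=n_enhancement)       # random biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))           # sigmoid enhancement nodes
    D = np.hstack([X, H])                            # direct links: original + random features
    # beta = (D^T D + lam I)^-1 D^T T
    beta = np.linalg.solve(D.T @ D + lam * np.eye(D.shape[1]), D.T @ T)
    return W, b, beta

def rvfl_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return np.hstack([X, H]) @ beta

# Tiny synthetic demo (hypothetical data, not one of the UCI datasets):
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
T = np.eye(2)[y]  # one-hot targets
W, b, beta = rvfl_train(X, T, n_enhancement=50, S=1.0, lam=0.01)
acc = (rvfl_predict(X, W, b, beta).argmax(axis=1) == y).mean()
```

Replacing the ridge solve by `np.linalg.pinv(D) @ T` gives the Moore–Penrose pseudoinverse variant of the same training step.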
Though there are many RVFL variants in the literature, some core features of RVFL remain unchanged. In this work, we choose the closed-form based RVFL and investigate the following issues by using 121 UCI datasets, as done in [11].
- 1.
Effect of direct links from the input layer to the output layer.
- 2.
Effect of the bias in the output neuron.
- 3.
Performance of 6 commonly used activation functions as summarized in Table 1.
- 4.
Performance of Moore–Penrose pseudoinverse and ridge regression (or regularized least square solutions) for the computation of the output weights.
- 5.
Effect of range for randomly generated parameters in hidden neurons.
Issues 1–4 in the above list are discussed in Section 2.3, while issue 5 is discussed in Section 2.5.
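Table 1 is not reproduced in this snippet. As an illustration of issue 3, the sketch below defines a set of commonly used activation functions; the particular six functions are an assumption on our part, not a reproduction of Table 1.

```python
import numpy as np

# Hypothetical selection of commonly used activation functions for the
# enhancement nodes; the exact entries of Table 1 are assumed, not quoted.
activations = {
    "sigmoid": lambda x: 1.0 / (1.0 + np.exp(-x)),
    "sine":    np.sin,
    "hardlim": lambda x: (np.asarray(x) >= 0).astype(float),       # hard limit
    "tribas":  lambda x: np.clip(1.0 - np.abs(x), 0.0, None),      # triangular basis
    "radbas":  lambda x: np.exp(-np.square(x)),                    # radial basis (Gaussian)
    "sign":    np.sign,
}
```

Each function maps the random projection X·W + b elementwise to the enhancement-node outputs, so swapping the activation requires no change to the closed-form solution for β.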
Section snippets
Datasets
All 121 datasets are from the UCI repository [22]. The details of the datasets are summarized in Table 2.
We follow the same procedure as in [11]. Randomized stratified sampling is employed to generate one training set and one test set (each with 50% of the available patterns), such that each class has the same number of training and test patterns. Parameter tuning is performed on this pair of sets to identify the parameters with the best performance on the test set. There are two
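The stratified 50/50 split described above can be sketched with scikit-learn's standard `train_test_split` API; the synthetic data below is a stand-in for a UCI dataset, and the paper's subsequent tuning loop is not reproduced.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for one UCI dataset (features X, labels y).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = np.repeat([0, 1], [60, 40])  # imbalanced classes

# Stratified 50/50 split: each class contributes the same number of
# patterns to the training and test sets.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0)
```

With `stratify=y`, the 60/40 class ratio is preserved exactly in both halves, which is what makes per-dataset comparisons across RVFL variants fair.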
Concluding remarks
In this work, we presented an extensive and comprehensive evaluation of variants of RVFL with closed-form solutions by using 121 UCI datasets [11]. The conclusions of our investigation are as follows:
- 1.
The effect of the direct links from the input layer to the output layer: it turns out that RVFL variants with direct links perform better than those without in all cases, as seen in Table 3.
- 2.
The effect of the bias in the output layer: it turns out that the bias term in the output neurons only has mixed
Acknowledgment
The authors would like to thank the Guest Editors and the reviewers for their valuable comments. In particular, the authors thank the managing Guest Editor, Associate Professor Dianhui Wang, for suggesting that we investigate the scaling of the randomization. Results presented in Section 2.5 show an overall performance enhancement due to tuning the scaling of the randomization.
References (39)
- et al., Fast decorrelated neural network ensembles with random weights, Inf. Sci. (2014)
- et al., Multilayer feedforward networks are universal approximators, Neural Netw. (1989)
- et al., Neural networks for predicting conditional probability densities: improved training scheme combining EM and RVFL, Neural Netw. (1998)
- et al., Multilayer feedforward networks with a nonpolynomial activation function can approximate any function, Neural Netw. (1993)
- et al., Learning and generalization characteristics of the random vector functional-link net, Neurocomputing (1994)
- et al., The functional link net and learning optimal control, Neurocomputing (1995)
- et al., Intelligent rate control for MPEG-4 coders, Eng. Appl. Artif. Intell. (2000)
- et al., Unconstrained word-based approach for off-line script recognition using density-based random-vector functional-link net, Neurocomputing (2000)
- et al., Distributed learning for random vector functional-link networks, Inf. Sci. (2015)
- et al., Face recognition using kernel ridge regression, Proceedings of the 2007 IEEE Conference on Computer Vision and Pattern Recognition (2007)
- The sample complexity of pattern classification with neural networks: the size of the weights is more important than the size of the network, IEEE Trans. Inf. Theor.
- Arcing classifier (with discussion and a rejoinder by the author), Ann. Stat.
- A rapid learning and dynamic stepwise updating algorithm for flat neural networks and the application to time-series prediction, IEEE Trans. Syst. Man Cybern. Part B: Cybern.
- A statistical self-organizing learning system for remote sensing classification, IEEE Trans. Geosci. Remote Sens.
- Handwritten digit recognition with a back-propagation network, Advances in Neural Information Processing Systems
- A comprehensive survey on functional link neural networks and an adaptive PSO–BP learning for CFLNN, Neural Comput. Appl.
- Statistical comparisons of classifiers over multiple data sets, J. Mach. Learn. Res.
- Neural network recognizer for hand-written zip code digits, Advances in Neural Information Processing Systems
- Do we need hundreds of classifiers to solve real world classification problems?, J. Mach. Learn. Res.