A novel multiplex cascade classifier for pedestrian detection

doi:10.1016/j.patrec.2013.04.015

Pattern Recognition Letters

Volume 34, Issue 14, 15 October 2013, Pages 1687-1693

https://doi.org/10.1016/j.patrec.2013.04.015

Highlights

• A novel multiplex classifier model is proposed for pedestrian detection.
• A new fusion strategy is introduced by cascading classifiers.
• The weighted linear regression model is introduced to train the weak classifiers in our model.
• A structure table is introduced to label the foreground pixels by means of background differences.

Abstract

Reliable pedestrian detection is of great importance in visual surveillance. In this paper, we propose a novel multiplex classifier model composed of two multiplexed cascade parts: a Haar-like cascade classifier and a shapelet cascade classifier. The Haar-like cascade classifier filters out most of the irrelevant image background, while the shapelet cascade classifier intensively detects head-shoulder features. A weighted linear regression model is introduced to train its weak classifiers. We also introduce a structure table to label the foreground pixels by means of background differences. The experimental results show that our classifier model provides satisfactory detection accuracy. In particular, our detection approach also performs well on low-resolution images and relatively complicated backgrounds.

Introduction

Pedestrian detection is essential in intelligent visual surveillance systems, as it provides the fundamental information for semantic understanding (Dollár et al., 2012). Promising applications have appeared in different fields, such as video surveillance (Viola et al., 2005, Haritaoglu et al., 2000, Wang et al., 2012), video conferencing (Yang et al., 1996) and driver assistance systems (Zhao and Thorpe, 2000, Gavrila and Munder, 2007, Geronimo et al., 2010). The field has developed rapidly in recent years (Gao et al., 2009, Cerri et al., 2010, Enzweiler and Gavrila, 2009, Hou and Pang, 2011, Dollár et al., 2009).

Viola and Jones (2001) presented the Haar-like classifier, which detects objects rapidly using AdaBoost classifier cascades in conjunction with Haar-like features. Rather than using the intensity value of a single pixel, Haar-like features use the change in contrast between adjacent rectangular groups of pixels. The contrast variances are used to determine relatively light and dark areas, and adjacent pixel groups with a relative contrast variance form a Haar-like feature. Haar-like features can easily be scaled by increasing or decreasing the size of the pixel group being examined, which allows them to detect objects of various sizes (Wilson and Fernandez, 2006). The main strength of the Haar-like classifier is its fast detection. In complicated environments or noisy images, however, increasing the number of stages does not improve detection accuracy and only increases the computational cost.

Shape features are usually considered the most important cue for pedestrian detection. Sabzmeydani and Mori (2007) introduced the shapelet feature concept to describe local pieces of shape formed by human features such as the head, shoulders and body. Their shapelet cascade classifier provided a lower error rate for pedestrian detection, but analyzing multi-level shapelet features is usually time-consuming.
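The contrast between adjacent rectangular pixel groups described above for Haar-like features can be illustrated with a small sketch. It uses a summed-area table (integral image) so that each rectangle sum costs four lookups. The function names, the toy image and the choice of a left-minus-right two-rectangle feature are illustrative assumptions, not taken from the paper.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) summed-area table for a 2D list of pixel values."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y), width w, height h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Contrast of two horizontally adjacent w x h rectangles (left minus right)."""
    left = rect_sum(ii, x, y, w, h)
    right = rect_sum(ii, x + w, y, w, h)
    return left - right


# Toy 4x4 image (hypothetical data): bright left half, dark right half.
image = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
ii = integral_image(image)
edge_response = two_rect_feature(ii, 0, 0, 2, 4)  # feature straddles the edge
flat_response = two_rect_feature(ii, 0, 0, 1, 4)  # feature lies in the bright half
```

Here the feature placed across the bright/dark boundary responds strongly, while the same feature placed inside a uniform region gives zero. Changing `w` and `h` rescales the feature without changing the cost per evaluation, which is one way to read the scaling property mentioned in the text.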

In this paper, we propose a novel multiplex classifier model composed of two parts: a Haar-like cascade classifier and a shapelet cascade classifier. The two classifiers form multiplexed cascades. The Haar-like cascade classifier effectively filters out most of the irrelevant image background, while the shapelet cascade classifier then intensively detects head-shoulder features. A weighted linear regression model is introduced to train the weak classifiers in our multiplex classifier model. We also introduce a structure table to label the foreground pixels of each video frame by means of background differences, which helps to reduce the computational cost and further improve detection speed.
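The excerpt does not give the exact formulation of the weighted linear regression used to train the weak classifiers. As a minimal sketch under that caveat, one common reading is a weighted least-squares fit of a line to a single scalar feature, with labels in {-1, +1} and per-sample weights, followed by thresholding the fitted value at zero. All names and the toy data below are hypothetical.

```python
def fit_weighted_linear(xs, ys, ws):
    """Weighted least-squares fit of y ~ a*x + b; returns (a, b)."""
    total = sum(ws)
    mean_x = sum(w * x for w, x in zip(ws, xs)) / total
    mean_y = sum(w * y for w, y in zip(ws, ys)) / total
    cov_xy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(ws, xs, ys))
    var_x = sum(w * (x - mean_x) ** 2 for w, x in zip(ws, xs))
    # Fall back to a constant predictor if the feature has no weighted variance.
    a = cov_xy / var_x if var_x > 0 else 0.0
    b = mean_y - a * mean_x
    return a, b


def weak_classify(a, b, x):
    """Sign of the regression output, used as a weak decision."""
    return 1 if a * x + b >= 0 else -1


# Hypothetical feature values, labels (+1 pedestrian, -1 background), weights.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [-1, -1, 1, 1]
ws = [0.25, 0.25, 0.25, 0.25]

a, b = fit_weighted_linear(xs, ys, ws)
predictions = [weak_classify(a, b, x) for x in xs]
```

In a boosting loop the weights `ws` would be raised for samples the current ensemble gets wrong, so the next fit is pulled toward them; how the paper couples this to its cascades is not stated in the excerpt.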

The rest of the paper is organized as follows. Related work on pedestrian detection is reviewed in Section 2. The cascade AdaBoost classifier is briefly introduced and its performance is analyzed in Section 3. Our multiplex classifier model is proposed in Section 4, where its learning algorithm and foreground pixel labeling approach are both discussed in detail. Experimental results and analysis are presented in Section 5, and conclusions are provided at the end.

Section snippets

Related works

Classifier performance is one of the most important factors in pedestrian detection and has therefore received considerable attention (Xu et al., 2011, Cheng and Jhan, 2013). Freund and Schapire (1997) proposed AdaBoost, a classifier model that constructs a strong classifier as a linear combination of simple weak classifiers. In this model, subsequent classifiers are tweaked in favor of the instances misclassified by previous classifiers. But AdaBoost is sensitive to noisy data and

Cascade Adaboost classifier

Viola and Jones (2001) proposed the AdaBoost cascade classifier, which is built by cascading a number of AdaBoost classifiers. Every stage of the cascade classifier either rejects the window or passes it to the next stage. Only the last stage can finally accept a window, but rejection may happen at any stage (Wojnarski, 2007, Landesa-Vázquez and Alba-Castro, 2012). According to the cascade AdaBoost classifier training theory, the probability that a rejected sample is a positive sample at stage K
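The reject-or-pass behavior of a cascade described in this section can be sketched as a loop with early exit. This is a generic illustration, not the authors' implementation: the two stage functions, their thresholds and the windows are hypothetical stand-ins for trained AdaBoost stages.

```python
def cascade_accepts(window, stages):
    """Return True only if the window passes every stage.

    stages is a list of (score_fn, threshold) pairs; evaluation stops at
    the first stage whose score falls below its threshold.
    """
    for score_fn, threshold in stages:
        if score_fn(window) < threshold:
            return False  # rejected here; later stages never run
    return True  # survived the last stage


# Hypothetical stage scores on a window given as a flat list of pixel values.
def mean_brightness(window):
    return sum(window) / len(window)


def contrast(window):
    return max(window) - min(window)


stages = [(mean_brightness, 50), (contrast, 100)]

dark_window = [10, 20, 30, 40]      # fails stage 1
flat_window = [100, 110, 120, 130]  # passes stage 1, fails stage 2
good_window = [20, 200, 40, 220]    # passes both stages

results = [cascade_accepts(w, stages) for w in (dark_window, flat_window, good_window)]
```

Because most windows in an image are background, placing cheap stages first means the bulk of the windows are discarded after very little computation.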

Multiplex cascade classifier

In this section, we present our multiplex cascade classifier. First, the classifier framework is discussed; then the learning algorithm for the proposed multiplex classifier is described. To reduce detection time and cost, we further propose a foreground pixel labeling algorithm that narrows the image search space.
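The layout of the structure table is not given in this excerpt. The sketch below assumes the simplest form, a grid of 0/1 labels obtained by thresholding the absolute difference between a frame and a background image, and shows how such a table could be used to skip windows that contain little foreground. The threshold, the minimum foreground fraction and all names are assumptions made for illustration.

```python
def label_foreground(frame, background, threshold=30):
    """Return a table of 0/1 labels: 1 where |frame - background| > threshold."""
    return [
        [1 if abs(f - b) > threshold else 0 for f, b in zip(frow, brow)]
        for frow, brow in zip(frame, background)
    ]


def window_has_foreground(labels, x, y, w, h, min_fraction=0.25):
    """True if at least min_fraction of the window's pixels are foreground."""
    count = sum(labels[r][c] for r in range(y, y + h) for c in range(x, x + w))
    return count >= min_fraction * w * h


# Hypothetical grayscale background and current frame (3 rows x 4 columns).
background = [
    [100, 100, 100, 100],
    [100, 100, 100, 100],
    [100, 100, 100, 100],
]
frame = [
    [100, 101, 180, 175],
    [99, 100, 170, 182],
    [100, 102, 100, 98],
]

labels = label_foreground(frame, background)
scan_left = window_has_foreground(labels, 0, 0, 2, 2)   # static region
scan_right = window_has_foreground(labels, 2, 0, 2, 2)  # changed region
```

Only windows for which the check returns true would be passed on to the classifier, which is one way the search space could be narrowed before the cascades run.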

Experiment results and discussions

To illustrate the effectiveness and performance of the proposed classifier model, we consider several test cases. Some were collected from pedestrian detection benchmark datasets,¹,² and others were obtained from our own video surveillance system. In our experiments, the single cascade classifier and the simple cascade classifier serve as baselines against which our multiplex cascade classifier is compared.

Conclusions

In this paper, we presented a multiplex classifier model composed of two multiplexed cascade parts: a Haar-like cascade classifier and a shapelet cascade classifier. The Haar-like cascade classifier filters out most of the irrelevant image background, while the shapelet cascade classifier intensively detects head-shoulder features. A weighted linear regression model was introduced to train its weak classifiers. We further introduced a structure table to label the foreground pixels by means

Acknowledgments

The authors thank Lixia Wang and Xiaoqing Ding for their scientific collaboration in this research work. This work is partly supported by the National Natural Science Foundation of China (Grant Nos. 61074029, 61173035), the Program for New Century Excellent Talents in University (Grant No. NCET-11-0861), and the Natural Science Foundation of Liaoning Province (Grant No. 20102014).

References (32)

W. Cheng et al. A self-constructing cascade classifier with AdaBoost and SVM for pedestrian detection. Engineering Applications of Artificial Intelligence (2013)
Y. Freund et al. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences (1997)
I. Landesa-Vázquez et al. Shedding light on the asymmetric learning capability of AdaBoost. Pattern Recognition Letters (2012)
J. Wang et al. On pedestrian detection and tracking in infrared videos. Pattern Recognition Letters (2012)
S. Belongie et al. Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence (2002)
P. Cerri et al. Day and night pedestrian detection using cascade AdaBoost system
N. Dalal et al. Histograms of oriented gradients for human detection
P. Dollár et al. Pedestrian detection: a benchmark
P. Dollár et al. Pedestrian detection: an evaluation of the state of the art. IEEE Transactions on Pattern Analysis and Machine Intelligence (2012)
M. Enzweiler et al. Monocular pedestrian detection: survey and experiments. IEEE Transactions on Pattern Analysis and Machine Intelligence (2009)
P. Felzenszwalb et al. Cascade object detection with deformable part models
W. Gao et al. Adaptive contour features in oriented granular space for human detection and segmentation
D. Gavrila et al. Multi-cue pedestrian detection and tracking from a moving vehicle. International Journal of Computer Vision (2007)
D. Geronimo et al. Survey of pedestrian detection for advanced driver assistance systems. IEEE Transactions on Pattern Analysis and Machine Intelligence (2010)
I. Haritaoglu et al. W4: real-time surveillance of people and their activities. IEEE Transactions on Pattern Analysis and Machine Intelligence (2000)
Y. Hou et al. People counting and human detection in a challenging situation. IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans (2011)

