
Neurocomputing

Volume 214, 19 November 2016, Pages 758-766

Traffic sign detection and recognition using fully convolutional network guided proposals

https://doi.org/10.1016/j.neucom.2016.07.009

Abstract

Detecting and recognizing traffic signs is a hot topic in the field of computer vision with many applications, e.g., safe driving, path planning and robot navigation. We propose a novel framework with two deep learning components: fully convolutional network (FCN) guided traffic sign proposals and a deep convolutional neural network (CNN) for object classification. Our core idea is to use the CNN to classify traffic sign proposals, yielding fast and accurate traffic sign detection and recognition. Due to the complexity of traffic scenes, we improve the state-of-the-art object proposal method, EdgeBox, by incorporating a trained FCN. The FCN guided object proposals produce more discriminative candidates, which helps make the whole detection system fast and accurate. In the experiments, we evaluate the proposed method on a publicly available traffic sign benchmark, the Swedish Traffic Signs Dataset (STSD), and achieve state-of-the-art results.

Introduction

Traffic scene analysis is an important topic in computer vision and intelligent systems [1], [2], [3], [4], [5], [6]. Traffic signs are designed to inform drivers of the current road conditions and other important information. They are rigid, simple shapes with eye-catching colors, and the information they carry is easy to understand. However, accidents may still occur when drivers do not notice a traffic sign in time. Hence, it is very important to design an automatic real-time driver assistance system to detect and recognize traffic signs.

Although the research community and industry in the field of computer vision have made significant progress in traffic sign detection and recognition in the past decades, two main difficulties remain. One is poor image quality due to low resolution, motion blur and noise. The other is uncontrolled environmental factors, including weather conditions, complex backgrounds, variable illumination and sign color fading. Fig. 1 shows some difficult examples. These factors keep traffic sign detection and recognition an open problem.

To tackle these problems, we propose a novel and efficient method to detect and recognize traffic signs. Traffic signs have a distinctive property: they always appear on the two sides of the road being traveled. Taking advantage of this property, the proposed method extracts traffic sign proposals under the guidance of a fully convolutional network, unlike traditional traffic sign detection and recognition methods, which usually start from color segmentation [7], [8], [9], [10], [11], shape detection [12], [13], [14], [15] or sliding-window scanning [16], [17] to find traffic sign regions. This effectively reduces the search range for traffic signs, thereby reducing the number of proposals and improving efficiency.

The pipeline of the proposed method for traffic sign detection is illustrated in Fig. 2; it is mainly composed of two parts. One is the proposal extraction stage guided by a fully convolutional network [18] and EdgeBox [19]. The other is the traffic sign classification stage using a convolutional neural network [20]. Given a scene image, coarse regions of traffic signs are first generated by the fully convolutional network. Then, traffic sign proposals are extracted by EdgeBox from the coarse sign regions. Finally, traffic signs are identified and false positives are eliminated by a trained convolutional neural network classifier, and the optimal bounding boxes are retained by non-maximum suppression.
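The last step of the pipeline, non-maximum suppression, greedily keeps the highest-scoring box and discards lower-scoring boxes that overlap it too much. A minimal sketch in Python (the 0.5 overlap threshold is illustrative, not a value taken from the paper):

```python
def iou(a, b):
    # Boxes as (x1, y1, x2, y2); intersection-over-union of two boxes.
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter)

def nms(boxes, scores, thresh=0.5):
    """Greedy non-maximum suppression; returns indices of kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        i = order.pop(0)          # highest-scoring remaining box
        keep.append(i)
        # Drop every remaining box that overlaps it above the threshold.
        order = [j for j in order if iou(boxes[i], boxes[j]) < thresh]
    return keep

boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2]: the two overlapping boxes collapse to one
```

The second box overlaps the first with IoU ≈ 0.82, so only the higher-scoring of the pair survives.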

Unlike previous methods for both traffic sign detection and general object detection, a novel FCN guided object proposal method is proposed. Since pixel-level annotation is unavailable for traffic sign detection, we propose to train the FCN using bounding-box-level annotation, which is a typical form of weak supervision for a semantic segmentation method like FCN.
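To illustrate the weak supervision idea, bounding-box annotations can be rasterized into per-pixel training targets by marking every pixel inside an annotated box as foreground. A minimal NumPy sketch; `boxes_to_mask` is a hypothetical helper, and the exact target encoding used in the paper may differ:

```python
import numpy as np

def boxes_to_mask(height, width, boxes):
    """Binary foreground mask from bounding-box annotations.

    boxes: iterable of (x1, y1, x2, y2) in pixel coordinates.
    Pixels inside any box are labeled 1 (sign), the rest 0 (background).
    """
    mask = np.zeros((height, width), dtype=np.uint8)
    for x1, y1, x2, y2 in boxes:
        mask[y1:y2, x1:x2] = 1
    return mask

mask = boxes_to_mask(100, 100, [(10, 20, 30, 40), (50, 50, 60, 70)])
print(mask.sum())  # 20*20 + 10*20 = 600 foreground pixels
```

The resulting mask is coarser than a true segmentation label, which is exactly why this counts as weak supervision.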

The proposed method is validated on a publicly available database, the Swedish Traffic Signs Dataset (STSD) [21]. The experimental results demonstrate that the proposed method achieves a much higher detection rate than other methods while producing fewer false positives. In addition, the proposed method is computationally efficient and can potentially be applied in real-time applications by utilizing the computational power of GPU devices.

Object detection using proposals is popular in the recent literature, especially after the R-CNN [22] series of works achieved state-of-the-art object detection performance on PASCAL VOC [23] and the ILSVRC detection task [24]. The proposed method follows R-CNN, but with at least three major differences: (1) we extend R-CNN to a new application, the challenging traffic sign detection problem; (2) we adopt a much faster object proposal method, EdgeBox, rather than selective search [25], which is more accurate but slower; (3) we propose a novel FCN guided object proposal method that is both fast and accurate.

In summary, the main contributions of this work are threefold. First, this paper proposes a new framework for traffic sign detection and recognition based on object proposals, which works in a coarse-to-fine manner. Second, guided by the heat maps of traffic signs generated by the fully convolutional network, the search area for signs is drastically narrowed. Third, an efficient multi-class CNN model is trained for sign recognition with a bootstrapping strategy. The proposed method achieves state-of-the-art performance on the traffic sign benchmark.
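The bootstrapping strategy mentioned in the third contribution is a form of hard negative mining: repeatedly retrain the classifier and feed its highest-scoring background windows back in as negatives. A generic sketch with stand-in `train` and `score` callables (the paper's actual CNN training loop is more involved):

```python
def bootstrap_negatives(train, score, candidates, rounds=3, per_round=2):
    """Generic hard-negative mining loop.

    train(negatives) -> model; score(model, sample) -> float.
    Each round, the highest-scoring remaining candidates (i.e. the
    background windows the current model most confuses with signs)
    are added to the negative training set.
    """
    negatives, pool = [], list(candidates)
    for _ in range(rounds):
        model = train(negatives)
        pool.sort(key=lambda s: score(model, s), reverse=True)
        hard, pool = pool[:per_round], pool[per_round:]
        negatives.extend(hard)
        if not pool:
            break
    return negatives

# Stand-in model: "train" ignores its input, and "score" rates a candidate
# by its own value, so the largest values count as the hardest negatives.
negs = bootstrap_negatives(
    train=lambda negs: None,
    score=lambda model, s: s,
    candidates=[0.1, 0.9, 0.4, 0.8, 0.2, 0.7],
)
print(negs)  # hardest first: [0.9, 0.8, 0.7, 0.4, 0.2, 0.1]
```

In the paper's setting, `candidates` would be background proposals and `score` the CNN's sign confidence; here both are toy stand-ins.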

The remainder of this paper is organized as follows: Section 2 presents recent work related to the proposed method. Section 3 details the proposed method, including coarse sign region segmentation, proposal extraction and sign classification. In Section 4, the experimental results and discussions are presented. Finally, conclusions and future work are given in Section 5.


Related work

Over the last decade, research in traffic sign detection and recognition has grown rapidly. A large number of novel ideas and effective methods have been proposed [7], [8], [9], [10], [12], [13], [16], [17], [21], [26], [27], [28], [29], [30], [31], [32], [33], [34], [35], [36], [37], [38], [39], [40]. Usually, the detection part locates potential regions of traffic signs and the recognition part determines the category of each sign.

The previous traffic sign detection methods can be divided into

Approach

In this section, we present the details of the detection pipeline illustrated in Fig. 2. It includes two main components: the novel proposal extraction component and the CNN classification component, with training and testing details.

Dataset and evaluation protocol

The proposed method is evaluated on a publicly available database, the Swedish Traffic Signs Dataset (STSD) [21]. STSD consists of two sets (Set1 and Set2) containing more than 20,000 frames in total, recorded on Swedish highways and city roads; every fifth frame has been manually labeled. Part0 of each set is annotated (roughly 2000 images), while Part1–Part4 are not. Some examples from STSD are shown in Fig. 5. The annotations for each image present status, location, type, and
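Detection rate and false positives on such annotations are typically computed by matching each detection to an as-yet-unmatched ground-truth box at an IoU threshold. A minimal sketch; the exact matching protocol used for STSD may differ:

```python
def iou(a, b):
    # Boxes as (x1, y1, x2, y2); intersection-over-union of two boxes.
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / float(union)

def match_detections(dets, gts, thresh=0.5):
    """Greedily match detections to ground truth at an IoU threshold.

    Returns (true_positives, false_positives, missed_ground_truths).
    """
    unmatched = list(range(len(gts)))
    tp = fp = 0
    for d in dets:
        best = max(unmatched, key=lambda g: iou(d, gts[g]), default=None)
        if best is not None and iou(d, gts[best]) >= thresh:
            unmatched.remove(best)   # each ground truth matched at most once
            tp += 1
        else:
            fp += 1
    return tp, fp, len(unmatched)

dets = [(10, 10, 50, 50), (200, 200, 240, 240)]
gts = [(12, 12, 52, 52)]
print(match_detections(dets, gts))  # (1, 1, 0): one hit, one false positive
```

Detection rate is then `tp / len(gts)` aggregated over all annotated frames.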

Conclusions

In this paper, we propose a novel and efficient method to detect and recognize traffic signs. The main contributions of this paper are the following: (1) We propose a new framework for traffic sign detection and recognition based on proposals guided by a fully convolutional network, which largely reduces the search area for traffic signs while maintaining the detection rate. (2) We extend R-CNN to a new application, the challenging traffic sign detection and recognition

Yingying Zhu was born in 1988. She received the B.S. degree in electronics and information engineering from the Huazhong University of Science and Technology (HUST), Wuhan, China, in 2011. She is currently a Ph.D. student with the School of Electronic Information and Communications, HUST. Her research areas mainly include text/traffic sign detection and recognition in natural images.

References (56)

  • F. Zaklouta et al., Real-time traffic sign recognition in three stages, Robot. Auton. Syst. (2014)
  • Z.-L. Sun et al., Application of BW-ELM model on traffic sign recognition, Neurocomputing (2014)
  • F. Yang et al., Dynamic texture recognition by aggregating spatial and temporal features via ensemble SVMs, Neurocomputing (2016)
  • B. Shi et al., Script identification in the wild via discriminative convolutional neural network, Pattern Recognit. (2016)
  • Z. Zhang et al., Practical camera calibration from moving objects for traffic scene surveillance, IEEE Trans. Circuits Syst. Video Technol. (2013)
  • C. Yao, X. Bai, W. Liu, L. Latecki, Human detection using learned part alphabet and pose dictionary, in: Proceedings of...
  • C. Hu et al., Learning discriminative pattern for real-time car brand recognition, IEEE Trans. Intell. Transp. Syst. (2015)
  • A. De La Escalera et al., Road traffic sign detection and classification, IEEE Trans. Ind. Electron. (1997)
  • S. Maldonado-Bascón et al., Road-sign detection and recognition based on support vector machines, IEEE Trans. Intell. Transp. Syst. (2007)
  • M.Á. García-Garrido, M.Á. Sotelo, E. Martín-Gorostiza, Fast road sign detection using Hough transform for assisted...
  • G. Loy, N. Barnes, Fast shape-based road sign detection for a driver assistance system, in: 2004 Proceedings of...
  • X. Bai et al., 3D shape matching via two layer coding, IEEE Trans. Pattern Anal. Mach. Intell. (2015)
  • I.M. Creusen, R.G. Wijnhoven, E. Herbschleb, P. De With, Color exploitation in HOG-based traffic sign detection, in:...
  • X. Baró et al., Traffic sign recognition using evolutionary AdaBoost detection and forest-ECOC classification, IEEE Trans. Intell. Transp. Syst. (2009)
  • J. Long, E. Shelhamer, T. Darrell, Fully convolutional networks for semantic segmentation, in: Proceedings of CVPR,...
  • C.L. Zitnick, P. Dollár, Edge boxes: locating object proposals from edges, in: Proceedings of ECCV, Springer, Zurich,...
  • A. Krizhevsky, I. Sutskever, G.E. Hinton, ImageNet classification with deep convolutional neural networks, in: Advances...
  • F. Larsson, M. Felsberg, Using Fourier descriptors and spatial models for traffic sign recognition, in: Image Analysis,...

Chengquan Zhang was born in 1990. He received the B.S. degree in electronics and information engineering from the Huazhong University of Science and Technology (HUST), Wuhan, China, in 2014. He is currently a Master's student with the School of Electronic Information and Communications, HUST. His main research interests include image classification and scene text detection.

Duoyou Zhou was born in 1993. He received the B.S. degree from the College of Information Science & Technology, Hainan University (HNU), Haikou, China, in 2015. He is currently a Master's student with the School of Electronic Information and Communications, HUST. His main research interests include object segmentation and scene text detection.

Xinggang Wang is an Assistant Professor with the School of Electronic Information and Communications, Huazhong University of Science and Technology. He received his Bachelor's degree in communication and information system and his Ph.D. degree in computer vision, both from Huazhong University of Science and Technology. From May 2010 to July 2011, he was with the Department of Computer and Information Science, Temple University, Philadelphia, PA, as a visiting scholar. From February 2013 to September 2013, he was with the University of California, Los Angeles, as a Visiting Graduate Researcher. He is a reviewer for IEEE Transactions on Cybernetics, Pattern Recognition, Computer Vision and Image Understanding, Neurocomputing, CVPR, ICCV, ECCV, etc. His research interests include computer vision and machine learning.

    Bai Xiang is currently a Professor with the School of Electronic Information and Communications, Huazhong University of Science and Technology (HUST). He received the B.S., M.S., and Ph.D. degrees from HUST in 2003, 2005, and 2009 respectively. His research interests include computer vision and pattern recognition, specifically including object recognition, shape analysis, scene text recognition and intelligent systems. He is now serving as the Editorial Board Member of Pattern Recognition Letters, Neurocomputing, Frontier of Computer Science.

    Wenyu Liu was born in 1963. He received the B.S. degree in Computer Science from Tsinghua University, Beijing, China, in 1986, and the M.S. and Ph.D. degrees, both in Electronics and Information Engineering, from Huazhong University of Science & Technology (HUST), Wuhan, China, in 1991 and 2001, respectively. He is now a professor and associate dean of the School of Electronic Information and Communications, HUST. His current research areas include computer vision, multimedia, and sensor network. He is a senior member of IEEE.
