Review
A comprehensive and systematic review on classical and deep learning based region proposal algorithms
Introduction
In the last few years, region proposal has been used as a key method for various visual recognition tasks such as text extraction (Gómez and Karatzas, 2017, Nguyen et al., 2017), object detection in natural images (Girshick, 2015, Girshick et al., 2014, He et al., 2015), traffic sign recognition (Ku, Mozian, Lee, Harakeh, & Waslander, 2018), object detection in medical images (Akselrod-Ballin et al., 2016, Mansoor et al., 2019), instance semantic segmentation (Liang, Lin, Wei, Shen, Yang, & Yan, 2017), salient object detection (Guo, Wang, Shen, Shao, Yang, Tao, et al., 2017), and object segmentation (Hariharan et al., 2014, Liu et al., 2018). Given the remarkable results of region-based methods on standard datasets such as ImageNet (Deng, Dong, Socher, Li, Li, & Fei-Fei, 2009), Pascal VOC (Everingham, Eslami, Van Gool, Williams, Winn, & Zisserman, 2015), and Microsoft COCO (Lin et al., 2014), region proposal can play a central role in visual recognition applications. Consequently, region proposal is gaining popularity in research, and numerous studies have developed and improved different region proposal algorithms.
The sliding window technique has been widely used in many applications (Dalal & Triggs, 2005) and has yielded considerable improvements in visual recognition problems (Felzenszwalb, Girshick, McAllester, & Ramanan, 2009). However, the technique processes millions of windows, so classifying every window is exceptionally inefficient. Regular grids, fixed scales, and fixed aspect ratios are used to obtain a reduced set of windows. Although these choices decrease the number of regions, the search space remains huge. Hence, it is essential to impose further constraints on the search space, for example with the branch and bound technique. Despite significantly decreasing the number of regions and being highly efficient for linear classifiers (Lampert, Blaschko, & Hofmann, 2009), those approaches failed to lower the computation time effectively.
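To make the size of the sliding-window search space concrete, the following minimal sketch counts the windows generated for a given image size, set of window shapes, and stride. The function names, the stride parameter, and the way window shapes are derived from base sizes and aspect ratios are illustrative assumptions, not taken from any of the cited works.

```python
def make_window_sizes(base_sizes, aspect_ratios):
    """Build (width, height) pairs from base side lengths and width/height ratios.

    Hypothetical helper: each shape keeps roughly the area base * base.
    """
    sizes = []
    for base in base_sizes:
        for ratio in aspect_ratios:
            width = round(base * ratio ** 0.5)
            height = round(base / ratio ** 0.5)
            sizes.append((width, height))
    return sizes


def count_sliding_windows(image_w, image_h, window_sizes, stride):
    """Count every window position for every (width, height) in window_sizes."""
    total = 0
    for win_w, win_h in window_sizes:
        if win_w > image_w or win_h > image_h:
            continue  # this window shape does not fit inside the image
        cols = (image_w - win_w) // stride + 1
        rows = (image_h - win_h) // stride + 1
        total += cols * rows
    return total


if __name__ == "__main__":
    shapes = make_window_sizes([32, 64, 128, 256], [0.5, 1.0, 2.0])
    # Compare a dense scan (stride 1) with a coarse regular grid (stride 8).
    print(count_sliding_windows(500, 375, shapes, stride=1))
    print(count_sliding_windows(500, 375, shapes, stride=8))
```

Under these assumptions, the count grows with the number of scales and aspect ratios and shrinks roughly quadratically as the stride increases, which is why regular grids and fixed scales reduce the number of windows without removing the need for further constraints.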
A superior alternative is image segmentation. The first region-based approach to object detection was proposed by Gu, Lim, Arbeláez, and Malik (2009), after which Endres and Hoiem introduced a category-independent algorithm for object detection (Endres & Hoiem, 2010). Arbeláez et al. developed a semantic segmentation algorithm in which hierarchical segmentation acquires the initial regions based on a contour detector (Arbeláez, Hariharan, Gu, Gupta, Bourdev, & Malik, 2012). The algorithm performed well on semantic image segmentation, especially for a number of specific objects. The region-based approach is also known as the region proposal algorithm, which extracts an adequate number of regions from the image. It has been used in various computer vision problems. This method extracts a much smaller number of regions than the previous techniques. For instance, the sliding window technique evaluates more than one million windows, whereas the region proposal method extracts 100,000 regions or fewer (Alexe et al., 2010, Carreira and Sminchisescu, 2011, Pont-Tuset et al., 2016, Uijlings et al., 2013). The efficiency of region proposal generation is another significant issue. The quality of the regions also matters because it determines how effective the proposals are in recognition problems. Because proposal algorithms produce a manageable number of regions, a powerful classifier can be applied to each region, which improves performance and decreases the error rate. Indeed, region proposal methods have achieved much higher accuracy than the Deformable Parts Model (DPM) technique (Felzenszwalb, McAllester, & Ramanan, 2008) in visual recognition problems (Girshick et al., 2014). In addition, the combination of region proposal and deep learning techniques has shown outstanding results in computer vision. There are also deep learning-based approaches, e.g. You Only Look Once (YOLO) (Redmon, Divvala, Girshick, & Farhadi, 2016) and YOLO9000 (Redmon & Farhadi, 2017), which have been implemented efficiently without region proposals and are suitable for real-time applications. Despite their high speed, they do not match region proposal-based methods in accuracy.
Given the impressive performance of region proposal algorithms in computer vision applications, it is necessary to introduce and analyze their theory. In addition, deep learning techniques have had a considerable influence on the growth of region proposal in recent years. Therefore, this article presents a comprehensive review of the recent progress of region proposal algorithms along with their theory and concepts. As stated earlier, region proposal is used as a practical approach in various computer vision applications, and several review articles have been presented on computer vision problems. Although most of these reviews mention region-based approaches to computer vision tasks, including object detection and semantic segmentation, their authors have not investigated region proposal algorithms in detail. Review papers presented over the past five years are summarized in Table 1. As can be seen, they describe several region proposal algorithms only briefly. By contrast, there is only one review that extensively describes and analyzes region proposal algorithms, and it covers the existing works only until 2015 (Hosang, Benenson, Dollár, & Schiele, 2015). Its authors conducted an in-depth analysis of 12 different algorithms along with their impact on object detection. Furthermore, Chavali et al. provided an extensive survey on classical region proposals and evaluation metrics (Chavali, Agrawal, Mahendru, & Batra, 2016). They reviewed algorithms up to 2016; however, they did not investigate deep learning-based algorithms. To the best of our knowledge, there is no detailed review of the current developments in region proposals and, more specifically, of the effect of deep learning techniques on them.
Therefore, an overview paper that precisely and comprehensively covers region proposal algorithms, concepts, evaluation metrics, applications, challenges, and future directions seems necessary. Unlike Hosang et al. (2015) and Hosang, Benenson, and Schiele (2014), this paper introduces recent region proposal algorithms and their properties. It also compares them and specifically presents deep learning-based region proposals. We describe the advantages and disadvantages of region proposal generation algorithms. Furthermore, ranking and refinement algorithms presented for improving region proposals are explained, and several practical examples are pointed out. We intend this survey to help readers and researchers better understand region proposal, its strengths and weaknesses, and its open problems. Our main contributions are: (i) presenting a comprehensive classification of region proposal algorithms and explaining deep learning-based region proposals; (ii) reviewing ranking and refinement algorithms; (iii) pointing to a number of practical examples in various region-based computer vision tasks; and (iv) addressing existing open problems in region proposals.
The structure of the paper is as follows. Section 2 reviews the concepts and theory of region proposal algorithms and explains the challenges of proposals along with evaluation metrics. In Section 3, we classify the region proposal generation methods and review the existing works. Section 4 presents an overview of the ranking algorithms, and Section 5 refers to computer vision applications using region proposals. Next, we discuss and summarize different region proposals in Section 6, where future directions are also sketched. Finally, Section 7 concludes the paper.
Section snippets
Region proposal: Theory, challenge, and evaluation metrics
Region proposal is used as a preprocessing stage or even the key step in many computer vision problems. The algorithm extracts from an image a pool of appropriate regions that are likely to contain objects. The extracted regions are represented as bounding boxes or segmented candidates. Region proposal methods can also be divided into class-independent and class-specific methods. The class-specific region proposal method is tuned to capture particular objects. It has shown satisfactory
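The two output forms mentioned in this snippet, a bounding box and a segmented candidate, are closely related: a segmented candidate can always be reduced to its tight bounding box. The sketch below illustrates this under stated assumptions; the mask representation (a list of rows of 0/1 values) and the function name are hypothetical and are not drawn from the reviewed algorithms.

```python
def mask_to_box(mask):
    """Return the tight (x_min, y_min, x_max, y_max) box around nonzero pixels.

    `mask` is assumed to be a list of rows, each a list of 0/1 values.
    Returns None when the mask contains no foreground pixel.
    """
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [x for row in mask for x, value in enumerate(row) if value]
    return (min(cols), rows[0], max(cols), rows[-1])


if __name__ == "__main__":
    candidate = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 0],
    ]
    print(mask_to_box(candidate))
```

The reverse direction is not possible without extra information, since a box discards the shape of the region.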
Region proposal generation
This section provides an overview of region proposal algorithms based on the defined classification. Accordingly, classical and advanced methods have been developed for proposal generation. Advanced methods are based on deep learning techniques, especially CNNs, whereas classical methods employ low-level features. Classical methods are further divided into window scoring-based and segmentation-based subcategories. Fig. 3 shows a proposed taxonomy for region proposals. Region proposals can be
Ranking algorithms
So far, two practical approaches have been introduced to improve the results of the region proposal algorithms. In the first approach, the ranking algorithm is used to rank proposals which can be performed in different ways. In the second approach, called the refinement method, the proposals are refined according to defined rules. It should be noted that window scoring and advanced methods generally use ranking techniques which are not considered as an independent unit. Moreover, some of
Applications
Region proposal algorithms are used in various computer vision tasks such as object detection, object segmentation, image labeling, and instance semantic segmentation, and have also been evaluated on different datasets. Many practical examples have shown satisfactory performance. In this study, we reviewed 65 scientific articles; Fig. 11 shows the percentage of deep learning-based network architectures used in different region-based applications. These network architectures
Discussion and future directions
This section discusses challenges in region proposal algorithms and analyzes and summarizes the different categories of region proposals. Some future directions in this area are also presented.
With the capability of region proposals, the region-based approach has been accepted as a beneficial strategy in visual recognition problems and computer vision tasks. This paper comprehensively reviewed region proposal algorithms based on classical and advanced categories. In the classical category, the window
Conclusion
This paper reviewed region proposal algorithms, which represent one of the most significant topics of the recent decade. More concretely, more than 60 different algorithms were reviewed in detail, and a classification of region proposals into classical and advanced categories was presented. Based on low-level features, classical methods utilize bottom-up segmentation or the sliding-window technique and were implemented on the CPU. Deep
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References (172)
- et al. A multi-strategy region proposal network. Expert Systems with Applications (2018)
- et al. Boundary-aware box refinement for object proposal generation. Neurocomputing (2017)
- et al. A survey on deep learning techniques for image and video semantic segmentation. Applied Soft Computing (2018)
- et al. TextProposals: a text-specific selective search algorithm for word spotting in the wild. Pattern Recognition (2017)
- et al. A brief survey on semantic segmentation with deep learning. Neurocomputing (2020)
- et al. Weakly supervised instance segmentation using multi-stage erasing refinement and saliency-guided proposals ordering. Journal of Visual Communication and Image Representation (2020)
- et al. Efficient and robust optic disc detection and fovea localization using region proposal network and cascaded network. Biomedical Signal Processing and Control (2020)
- et al. Joint 3D proposal generation and object detection from view aggregation
- et al. Survey on semantic segmentation using deep learning techniques. Neurocomputing (2019)
- et al. ListNet-based object proposals ranking. Neurocomputing (2017)
- SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Transactions on Pattern Analysis and Machine Intelligence
- A region based convolutional network for tumor detection and classification in breast mammography
- What is an object?
- Measuring the objectness of image windows. IEEE Transactions on Pattern Analysis and Machine Intelligence
- Semantic segmentation using regions and parts
- Contour detection and hierarchical image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence
- Region-based semantic segmentation with end-to-end training
- CPMC: Automatic object segmentation using constrained parametric min-cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence
- Rooted spanning superpixels. International Journal of Computer Vision
- A comprehensive analysis of weakly-supervised semantic segmentation in different image domains. International Journal of Computer Vision
- An enhanced region proposal network for object detection using deep learning method. PLoS One
- R-CNN for small object detection
- A survey of the four pillars for small object detection: Multiscale representation, contextual information, super-resolution, and region proposal. IEEE Transactions on Systems, Man, and Cybernetics: Systems
- High-quality proposals for weakly supervised object detection. IEEE Transactions on Image Processing
- N-RPN: Hard example learning for region proposal networks
- Mean shift: A robust approach toward feature space analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence
- R-FCN: Object detection via region-based fully convolutional networks
- Histograms of oriented gradients for human detection
- Segmenting unknown 3D objects from real depth images using Mask R-CNN trained on synthetic data
- ImageNet: A large-scale hierarchical image database
- Category independent object proposals
- The PASCAL visual object classes challenge: A retrospective. International Journal of Computer Vision
- Object detection with discriminatively trained part-based models. IEEE Transactions on Pattern Analysis and Machine Intelligence
- Efficient graph-based image segmentation. International Journal of Computer Vision
- A discriminatively trained, multiscale, deformable part model
- Survey of recent progress in semantic image segmentation with CNNs. Science China. Information Sciences
- Understanding deep learning techniques for image segmentation. ACM Computing Surveys