A CNN-Based Approach for Automatic Building Detection and Recognition of Roof Types Using a Single Aerial Image

Alidoost, Fatemeh; Arefi, Hossein

doi:10.1007/s41064-018-0060-5

Original Article
Published: 15 January 2019

Volume 86, pages 235–248 (2018)

PFG – Journal of Photogrammetry, Remote Sensing and Geoinformation Science


Abstract

Automatic detection and reconstruction of buildings have become essential in many remote sensing and computer vision applications. In this paper, the capability of Convolutional Neural Networks (CNNs) is investigated for building detection and roof shape recognition using a single image. The major steps are training dataset generation, model training, image segmentation, building detection and roof shape recognition. First, a CNN is trained to extract urban objects such as trees, roads and buildings. Next, a second trained CNN classifies the roofs into flat, gable and hip shapes. The assessment results demonstrate the effectiveness of the proposed method, with quality rates of approximately 97% and 92% in the detection and recognition steps, respectively.
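The two-stage flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function `recognize_roofs`, the `classify_object` and `classify_roof` callables (stand-ins for the two trained CNNs) and the representation of a segment are all assumptions made for this example.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

# Hypothetical stand-in for whatever an image segment looks like in practice.
Segment = TypeVar("Segment")

# Roof classes named in the abstract.
ROOF_TYPES = ("flat", "gable", "hip")


def recognize_roofs(
    segments: Iterable[Segment],
    classify_object: Callable[[Segment], str],
    classify_roof: Callable[[Segment], str],
) -> List[Tuple[int, str]]:
    """Run the assumed two-stage flow over image segments.

    Stage 1 (classify_object) labels each segment as an urban object,
    for example "tree", "road" or "building". Stage 2 (classify_roof)
    is applied only to segments labelled "building".
    Returns (segment index, roof type) pairs for the detected buildings.
    """
    results: List[Tuple[int, str]] = []
    for index, segment in enumerate(segments):
        if classify_object(segment) != "building":
            continue  # trees, roads and other objects are skipped
        roof = classify_roof(segment)
        if roof not in ROOF_TYPES:
            raise ValueError(f"unexpected roof type: {roof!r}")
        results.append((index, roof))
    return results


if __name__ == "__main__":
    # Toy segments with the labels baked in, so no trained model is needed.
    demo = [
        {"kind": "tree"},
        {"kind": "building", "roof": "hip"},
        {"kind": "road"},
        {"kind": "building", "roof": "flat"},
    ]
    print(recognize_roofs(demo, lambda s: s["kind"], lambda s: s["roof"]))
```

Passing the classifiers as callables keeps the sketch independent of any deep learning framework; in a real setting each callable would wrap a trained network.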

Zusammenfassung (English translation)

A CNN-based approach for the automatic detection of buildings and roof types in a single aerial image. The automatic detection and reconstruction of buildings has become essential in many remote sensing and computer vision applications. This paper investigates the capability of Convolutional Neural Networks (CNNs) to detect buildings and roof shapes in a single image. The main steps are training dataset generation, model training, image segmentation, and building and roof shape recognition. First, a CNN is trained to extract urban objects such as trees, roads and buildings, and the dataset is classified. The roofs are then classified into flat, gable and hip types using the second trained CNN. The results demonstrate the success of the proposed method, with classification accuracies of approximately 97% and 92% for building detection and roof shape classification, respectively.
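The abstract reports quality rates without defining them in this excerpt. The sketch below assumes the definitions commonly used in building-detection evaluation, where completeness, correctness and quality are computed from true positives, false positives and false negatives. The class name, function names and the counts in the usage example are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionCounts:
    """Counts from comparing detections against reference data."""
    true_positives: int
    false_positives: int
    false_negatives: int


def _ratio(numerator: int, denominator: int) -> float:
    # Return 0.0 for an empty denominator instead of dividing by zero.
    return numerator / denominator if denominator else 0.0


def completeness(c: DetectionCounts) -> float:
    """Share of reference objects that were detected: TP / (TP + FN)."""
    return _ratio(c.true_positives, c.true_positives + c.false_negatives)


def correctness(c: DetectionCounts) -> float:
    """Share of detections that are correct: TP / (TP + FP)."""
    return _ratio(c.true_positives, c.true_positives + c.false_positives)


def quality(c: DetectionCounts) -> float:
    """Combined measure penalising both error types: TP / (TP + FP + FN)."""
    total = c.true_positives + c.false_positives + c.false_negatives
    return _ratio(c.true_positives, total)


if __name__ == "__main__":
    # Illustrative counts only, not results from the paper.
    counts = DetectionCounts(true_positives=90, false_positives=6, false_negatives=4)
    print(f"completeness={completeness(counts):.3f}")
    print(f"correctness={correctness(counts):.3f}")
    print(f"quality={quality(counts):.3f}")
```

Under these assumed definitions, quality can never exceed completeness or correctness, because its denominator includes both false positives and false negatives.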

Notes

https://github.com/loosgagnet/Building-detection-and-roof-type-recognition.

References

Alidoost F, Arefi H (2016) Knowledge based 3D building model recognition using convolutional neural networks from lidar and aerial imageries. Int Arch Photogramm Remote Sens Spat Inf Sci XLI-B3:833–840. https://doi.org/10.5194/isprsarchives-xli-b3-833-2016
Awrangjeb M, Zhang C, Fraser CS (2013) Automatic extraction of building roofs using lidar data and multispectral imagery. ISPRS J Photogramm Remote Sens 83:1–18. https://doi.org/10.1016/j.isprsjprs.2013.05.006
Ballard DH, Brown CM (1982) Computer vision. Prentice-Hall Inc, New Jersey
Benedek C, Descombes X, Zerubia J (2012) Building development monitoring in multitemporal remotely sensed image pairs with stochastic birth-death dynamics. IEEE Trans Pattern Anal Mach Intell 34(1):33–50. https://doi.org/10.1109/TPAMI.2011.94
Bengio Y (2009) Learning deep architectures for AI. Found Trends® Mach Learn 2(1):1–127. https://doi.org/10.1561/2200000006
Bengio Y (2012) Deep learning of representations for unsupervised and transfer learning. JMLR 27:17–37
Chatfield K, Simonyan K, Vedaldi A, Zisserman A (2014) Return of the devil in the details: delving deep into convolutional nets. Proc B Mach Vision Conf. arXiv:1405.3531
Chen Y, Zhao X, Jia X et al (2015) Spectral-spatial classification of hyperspectral data based on deep belief network. IEEE J Sel Top Appl Earth Obs Remote Sens 8(6):2381–2392. https://doi.org/10.1109/JSTARS.2015.2388577
Cheng L, Gong J, Li M, Liu Y (2011) 3D building model reconstruction from multi-view aerial imagery and lidar data. Photogramm Eng Remote Sens 77(2):125–139. https://doi.org/10.14358/PERS.77.2.125
Cramer M (2010) The DGPF-test on digital airborne camera evaluation—overview and test design. Photogramm Fernerkundung Geoinf 2010:73–82. https://doi.org/10.1127/1432-8364/2010/0041
Deng L, Yu D (2014) Deep learning: methods and applications. Found Trends® Signal Process 7(3–4):197–387. https://doi.org/10.1561/2000000039
Deng J, Dong W, Socher R et al (2009) ImageNet: a large-scale hierarchical image database. In: Proceedings of IEEE conference on computer vision and pattern recognition (CVPR2009). IEEE, Miami, FL, USA, pp 248–255. https://doi.org/10.1109/CVPR.2009.5206848
Donahue J, Jia Y, Vinyals O et al (2014) DeCAF: a deep convolutional activation feature for generic visual recognition. Proc 31st Int Conf Mach Learn, PMLR 32(1):647–655
Dornaika F, Moujahid A, El Merabet Y, Ruichek Y (2016) Building detection from orthophotos using a machine learning approach: an empirical study on image segmentation and descriptors. Expert Syst Appl 58:130–142. https://doi.org/10.1016/j.eswa.2016.03.024
Dorninger P, Pfeifer N (2008) A comprehensive automated 3D approach for building extraction, reconstruction, and regularization from airborne laser scanning point clouds. Sensors 8:7323–7343. https://doi.org/10.3390/s8117323
Felzenszwalb PF, Huttenlocher DP (2004) Efficient graph-based image segmentation. Int J Comp Vision 59(2):167–181. https://doi.org/10.1023/B:VISI.0000022288.19776.77
Gamba P, Houshmand B (2000) Digital surface models and building extraction: a comparison of IFSAR and LIDAR data. IEEE Trans Geosci Remote Sens 38(4):1959–1968. https://doi.org/10.1109/36.851777
Ghaffarian S, Ghaffarian S (2014) Automatic building detection based on purposive FastICA (PFICA) algorithm using monocular high resolution google earth images. ISPRS J Photogramm Remote Sens 97:152–159. https://doi.org/10.1016/j.isprsjprs.2014.08.017
Girshick R (2015) Fast R-CNN. In: Proceedings of IEEE international conference on computer vision (ICCV2015). IEEE, Santiago, Chile, pp 1440–1448. https://doi.org/10.1109/ICCV.2015.169
Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of IEEE conference on computer vision and pattern recognition (CVPR2014). IEEE, Columbus, Ohio, pp 580–587. https://doi.org/10.1109/CVPR.2014.81
Girshick R, Donahue J, Darrell T, Malik J (2015) Region-based convolutional networks for accurate object detection and segmentation. IEEE Trans Pattern Anal Mach Intell 38(1):142–158. https://doi.org/10.1109/TPAMI.2015.2437384
Guo L, Chehata N, Mallet C, Boukir S (2011) Relevance of airborne lidar and multispectral image data for urban scene classification using random forests. ISPRS J Photogramm Remote Sens 66:56–66. https://doi.org/10.1016/j.isprsjprs.2010.08.007
Haala N, Brenner C (1999) Extraction of buildings and trees in urban environments. ISPRS J Photogramm Remote Sens 54:130–137. https://doi.org/10.1016/S0924-2716(99)00010-6
He K, Gkioxari G, Dollár P, Girshick R (2017) Mask R-CNN. In: Proceedings of IEEE international conference on computer vision (ICCV2017). IEEE, Venice, Italy, pp 2980–2988. https://doi.org/10.1109/ICCV.2017.322
Hermosilla T, Ruiz LA, Recio JA, Estornell J (2011) Evaluation of automatic building detection approaches combining high resolution images and lidar data. Remote Sens 3:1188–1210. https://doi.org/10.3390/rs3061188
Höfle B, Mücke W, Dutter M, Rutzinger M (2009) Detection of building regions using airborne lidar: a new combination of raster and point cloud based GIS methods. Proc Geoinformatics Forum Salzburg. pp 66–75. https://webapps.itc.utwente.nl/library/2009/chap/rutzinger_det.pdf. Accessed 15 Jan 2017
Huang J, Rathod V, Sun C et al (2017) Speed/accuracy trade-offs for modern convolutional object detectors. In: Proceedings of IEEE conference on computer vision and pattern recognition (CVPR2017). IEEE, Honolulu, HI, USA, pp 3296–3297. https://doi.org/10.1109/CVPR.2017.351
ISPRS (2012) Web site of the ISPRS test project on urban classification and 3D building reconstruction. Available at http://www2.isprs.org/commissions/comm3/wg4/detection-and-reconstruction.html. Accessed 17 Sep. 2016
Izadi M, Saeedi P (2012) Three-dimensional polygonal building model estimation from single satellite images. Geosci Remote Sens IEEE Trans 50(6):2254–2272. https://doi.org/10.1109/TGRS.2011.2172995
Kabolizade M, Ebadi H, Ahmadi S (2010) An improved snake model for automatic extraction of buildings from urban aerial images and lidar data. Comput Environ Urban Syst 34:435–441. https://doi.org/10.1016/j.compenvurbsys.2010.04.006
Karantzalos K, Koutsourakis P, Kalisperakis I, Grammatikopoulos L (2015) Model-based building detection from low-cost optical sensors onboard unmanned aerial vehicles. Int Arch Photogramm Remote Sens Spat Inf Sci 40:293–297. https://doi.org/10.5194/isprsarchives-xl-1-w4-293-2015
Khurana M, Wadhwa V (2015) Automatic building detection using modified grab cut algorithm from high resolution satellite image. Int J Adv Res Comput Commun Eng 4(8):158–164. https://doi.org/10.17148/IJARCCE.2015.4833
Kim K, Shan J (2011) Building roof modeling from airborne laser scanning data based on level set approach. ISPRS J Photogramm Remote Sens 66:484–497. https://doi.org/10.1016/j.isprsjprs.2011.02.007
Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. Proc 25th Int Conf Neural Infor Proc Syst, NIPS’12 1:1097–1105
LeCun Y, Bengio Y (1995) Convolutional networks for images, speech, and time-series. In: Arbib MA (ed) The handbook of brain theory and neural networks. MIT Press
Li E, Femiani J, Xu S et al (2015) Robust rooftop extraction from visible band images using higher order CRF. IEEE Trans Geosci Remote Sens 53(8):4483–4495. https://doi.org/10.1109/TGRS.2015.2400462
Liu T, Fang S, Zhao Y et al (2015) Implementation of training convolutional neural networks. arXiv:1506.01195
Liu W, Anguelov D, Erhan D et al (2016) SSD: single shot multibox detector. In: Leibe B, Matas J, Sebe N, Welling M (eds) Computer vision—ECCV 2016. ECCV 2016. Lecture notes in computer science. Springer, Cham
Maas HG, Vosselman G (1999) Two algorithms for extracting building models from raw laser altimetry data. ISPRS J Photogramm Remote Sens 54:153–163. https://doi.org/10.1016/S0924-2716(99)00004-0
Maitra DS, Bhattacharya U, Parui SK (2015) CNN based common approach to handwritten character recognition of multiple scripts. In: Proceedings of international conference on document analysis recognition (ICDAR2015). IEEE, Tunis, Tunisia, pp 1021–1025. https://doi.org/10.1109/icdar.2015.7333916
Makantasis K, Karantzalos K, Doulamis A, Doulamis N (2015) Deep supervised learning for hyperspectral data classification through convolutional neural networks. IEEE Int Geosci Remote Sens Symp 2015:4959–4962. https://doi.org/10.1109/IGARSS.2015.7326945
Maltezos E, Ioannidis C (2015) Automatic detection of building points from lidar and dense image matching point clouds. ISPRS Ann Photogramm Remote Sens Spat Inf Sci II-3/W5:33–40. https://doi.org/10.5194/isprsannals-ii-3-w5-33-2015
Manno-Kovacs A, Ok AO (2015) Building detection from monocular vhr images by integrated urban area knowledge. IEEE Geosci Remote Sens Lett 12(10):2140–2144. https://doi.org/10.1109/LGRS.2015.2452962
McGlone JC, Shufelt JA (1994) Projective and object space geometry for monocular building extraction. In: Proceedings of IEEE conference on computer vision and pattern recognition (CVPR94). IEEE, Seattle, WA, USA, pp 54–61. https://doi.org/10.1109/CVPR.1994.323810
McKeown DM, Bulwinkle T, Cochran S, Harvey W, McGlone C, Shufelt JA (2000) Performance evaluation for automatic feature extraction. Int Arch Photogramm Remote Sens Spat Inf Sci XXXIII(B2):379–394
Nalani HA (2014) Automatic reconstruction of urban objects from mobile laser scanner data. Dissertation for awarding the academic degree Doktor-Ingenieur. Dresden, Germany
Ok AO, Senaras C, Yuksel B (2013) Automated detection of arbitrarily shaped buildings in complex environments from monocular VHR optical satellite imagery. IEEE Trans Geosci Remote Sens 51(3):1701–1717. https://doi.org/10.1109/TGRS.2012.2207123
Oztimur Karadag O, Senaras C, Yarman Vural FT (2015) Segmentation fusion for building detection using domain-specific information. IEEE J Sel Top Appl Earth Obs Remote Sens 8(7):3305–3315. https://doi.org/10.1109/JSTARS.2015.2403617
Phung SL, Bouzerdoum A (2009) Matlab library for convolutional neural networks. Technical report, ICT research institute, visual and audio signal processing lab, university of Wollongong. https://www.uow.edu.au/~phung. Accessed 15 Aug 2016
Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: unified, real-time object detection. In: Proceedings of IEEE conference on computer vision and pattern recognition (CVPR2016). IEEE, Las Vegas, NV, USA, pp 779–788. https://doi.org/10.1109/CVPR.2016.91
Ren S, He K, Girshick R, Sun J (2016) Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell 39:1137–1149. https://doi.org/10.1109/TPAMI.2016.2577031
Rottensteiner F, Trinder J, Clode S, Kubik K (2007) Building detection by fusion of airborne laser scanner data and multi-spectral images: performance evaluation and sensitivity analysis. ISPRS J Photogramm Remote Sens 62:135–149. https://doi.org/10.1016/j.isprsjprs.2007.03.001
Saito S, Aoki Y (2015) Building and road detection from large aerial imagery. Proc. SPIE 9405, Image processing: machine vision applications VIII:94050K. https://doi.org/10.1117/12.2083273
Sampath A, Shan J (2010) Segmentation and reconstruction of polyhedral building roofs from aerial lidar point clouds. IEEE Trans Geosci Remote Sens 48(3):1554–1567. https://doi.org/10.1109/TGRS.2009.2030180
Schmidhuber J (2015) Deep Learning in neural networks: an overview. Neural Networks 61:85–117. https://doi.org/10.1016/j.neunet.2014.09.003
Senaras C, Vural FTY (2016) A self-supervised decision fusion framework for building detection. IEEE J Sel Top Appl Earth Obs Remote Sens 9(5):1780–1791. https://doi.org/10.1109/JSTARS.2015.2463118
Simonyan K, Zisserman A (2015) Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556
Singh G, Jouppi M, Zhang Z, Zakhor A (2015) Shadow based building extraction from single satellite image. Comput Imaging XIII:94010F. https://doi.org/10.1117/12.2083500
Tuia D, Flamary R, Courty N (2015) Multiclass feature learning for hyperspectral image classification: sparse and hierarchical solutions. ISPRS J Photogramm Remote Sens 105:272–285. https://doi.org/10.1016/j.isprsjprs.2015.01.006
Uijlings JRR, Van De Sande KEA, Gevers T, Smeulders AWM (2013) Selective search for object recognition. Int J Comput Vis 104(2):154–171. https://doi.org/10.1007/s11263-013-0620-5
Vakalopoulou M, Karantzalos K, Komodakis N, Paragios N (2015) Building detection in very high resolution multispectral data with deep learning features. In: Proceedings of IEEE international geoscience remote sensing symposium (IGARSS2015). IEEE, Milan, Italy, pp 1873–1876. https://doi.org/10.1109/igarss.2015.7326158
Vedaldi A, Lenc K (2015) MatConvNet-Convolutional neural networks for MATLAB. In: Proceedings of the ACM international conference on multimedia. ACM, Brisbane, Australia, pp 689–692. https://doi.org/10.1145/2733373.2807412
Von Gioi RG, Jakubowicz J, Morel J-M, Randall G (2010) LSD: a fast line segment detector with a false detection control. IEEE Trans Pattern Anal Mach Intell 32(4):722–732. https://doi.org/10.1109/TPAMI.2008.300
Vu TT, Yamazaki F, Matsuoka M (2009) Multi-scale solution for building extraction from lidar and image data. Int J Appl Earth Obs Geoinf 11(4):281–289. https://doi.org/10.1016/j.jag.2009.03.005
Yu B, Liu H, Wu J et al (2010) Automated derivation of urban building density information using airborne lidar data and object-based method. Landsc Urban Plan 98(3–4):210–219. https://doi.org/10.1016/j.landurbplan.2010.08.004
Yuan J (2016) Automatic building extraction in aerial scenes using convolutional networks. http://jiangyeyuan.com/bldgExt.html. arXiv:1602.06564. Accessed 15 Jan 2017
Zeiler MD, Fergus R (2014) Visualizing and understanding convolutional networks. Comput vision–ECCV 2014 8689:818–833. https://doi.org/10.1007/978-3-319-10590-1_53
Zhang K, Yan J, Chen SC (2006) Automatic construction of building footpoints from airborne lidar data. IEEE Trans Geosci Remote Sens 44(9):2523–2533. https://doi.org/10.1109/TGRS.2006.874137
Zhang Y, Sohn K, Villegas R et al (2015) Improving object detection with deep convolutional networks via bayesian optimization and structured prediction. In: Proceedings of IEEE conference on computer vision and pattern recognition (CVPR2015), Boston, MA, USA, pp 249–258. https://doi.org/10.1109/cvpr.2015.7298621
Zhang Q, Wang Y, Liu Q et al (2016) CNN based suburban building detection using monocular high resolution google earth images. In: Proceedings of IEEE international geoscience remote sensing symposium (IGARSS2016). IEEE, Beijing, China, pp 661–664. https://doi.org/10.1109/IGARSS.2016.7729166
Zuo Z, Wang G (2014) Learning discriminative hierarchical features for object recognition. IEEE Signal Process Lett 21(9):1159–1163. https://doi.org/10.1109/LSP.2014.2298888

Acknowledgements

The Vaihingen and Potsdam data sets were provided by the German Society for Photogrammetry, Remote Sensing and Geoinformation (DGPF) (ISPRS 2012; Cramer 2010), which the authors acknowledge.

Author information

Authors and Affiliations

School of Surveying and Geospatial Engineering, College of Engineering, University of Tehran, Tehran, Iran
Fatemeh Alidoost & Hossein Arefi

Corresponding author

Correspondence to Hossein Arefi.

About this article

Cite this article

Alidoost, F., Arefi, H. A CNN-Based Approach for Automatic Building Detection and Recognition of Roof Types Using a Single Aerial Image. PFG 86, 235–248 (2018). https://doi.org/10.1007/s41064-018-0060-5

Received: 31 May 2017
Accepted: 14 December 2018
Published: 15 January 2019
Issue Date: 12 December 2018
DOI: https://doi.org/10.1007/s41064-018-0060-5
