Semantic Map Annotation Through UAV Video Analysis Using Deep Learning Models in ROS

  • Conference paper
MultiMedia Modeling (MMM 2019)

Abstract

Enriching the map of the flight environment with semantic knowledge is a common need in many UAV applications. Safety legislation requires no-fly zones near crowded areas, which can be indicated by semantic annotations on a geometric map. This work proposes the automatic annotation of 3D maps with crowded areas by projecting 2D annotations derived from visual analysis of UAV video frames. To this end, a fully convolutional neural network is proposed that complies with the computational restrictions of the application, effectively distinguishes between crowded and non-crowded scenes using a regularized multiple-loss training method, and provides semantic heatmaps that are projected onto the 3D occupancy grid of OctoMap. The projection is based on raycasting and yields polygonal areas that are geo-localized on the map and can be exported in KML format. An initial qualitative evaluation using both synthetic and real-world drone scenes demonstrates the applicability of the method.
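The projection step summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several simplifying assumptions: a pinhole camera, a flat ground plane standing in for the OctoMap occupancy grid, a bounding box standing in for the extracted polygon, and a small-offset conversion from local east/north metres to latitude/longitude. All function names and parameters below are hypothetical.

```python
import math
from xml.sax.saxutils import escape

def pixel_ray(u, v, fx, fy, cx, cy):
    """Ray direction in the camera frame (x right, y down, z forward) for pixel (u, v)."""
    return ((u - cx) / fx, (v - cy) / fy, 1.0)

def rotate(R, d):
    """Apply a 3x3 rotation matrix (nested lists, row-major) to a 3-vector."""
    return tuple(sum(R[i][k] * d[k] for k in range(3)) for i in range(3))

def cast_to_ground(cam_pos, R_wc, ray_cam, ground_z=0.0):
    """Intersect a camera ray with the plane z = ground_z.

    Assumption: the flat plane replaces the first occupied voxel that a
    raycast through a real occupancy grid would return.
    """
    dx, dy, dz = rotate(R_wc, ray_cam)
    if dz >= 0 or cam_pos[2] <= ground_z:
        raise ValueError("ray does not reach the ground plane")
    t = (ground_z - cam_pos[2]) / dz
    return (cam_pos[0] + t * dx, cam_pos[1] + t * dy, ground_z)

def crowd_bbox_pixels(heatmap, threshold):
    """Corner pixels (u, v) of the bounding box around heatmap cells above threshold."""
    hits = [(u, v) for v, row in enumerate(heatmap)
            for u, p in enumerate(row) if p > threshold]
    if not hits:
        return []
    us = [u for u, _ in hits]
    vs = [v for _, v in hits]
    u0, u1, v0, v1 = min(us), max(us), min(vs), max(vs)
    return [(u0, v0), (u1, v0), (u1, v1), (u0, v1)]

def enu_to_lonlat(east, north, lat0, lon0):
    """Small-offset approximation from local east/north metres to degrees."""
    r_earth = 6378137.0  # WGS84 equatorial radius in metres
    lat = lat0 + math.degrees(north / r_earth)
    lon = lon0 + math.degrees(east / (r_earth * math.cos(math.radians(lat0))))
    return lon, lat

def polygon_to_kml(points_enu, lat0, lon0, name="crowd"):
    """Serialize ground points (east, north, up) as a single KML polygon placemark."""
    if not points_enu:
        raise ValueError("polygon needs at least one point")
    ring = [enu_to_lonlat(e, n, lat0, lon0) for e, n, _ in points_enu]
    ring.append(ring[0])  # KML linear rings must repeat the first vertex
    coords = " ".join(f"{lon:.7f},{lat:.7f},0" for lon, lat in ring)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2"><Placemark>'
        f"<name>{escape(name)}</name><Polygon><outerBoundaryIs><LinearRing>"
        f"<coordinates>{coords}</coordinates>"
        "</LinearRing></outerBoundaryIs></Polygon></Placemark></kml>"
    )
```

In this sketch, a heatmap is thresholded, the four bounding-box corner pixels are cast to the ground from the camera pose, and the resulting points are passed to `polygon_to_kml` together with the latitude and longitude of the local map origin. A real pipeline would replace the plane intersection with a raycast through the occupancy grid and the bounding box with a proper contour.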

The research leading to these results has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement number 731667 (MULTIDRONE). This publication reflects only the authors' views. The European Union is not liable for any use that may be made of the information contained therein.

Notes

  1. http://publicapps.caa.co.uk/docs/33/CAP393_E5A3_MAR2018(p).pdf

  2. https://www.enac.gov.it/repository/ContentManagement/information/N1220929004/Regulation_RPAS_Issue_2_Rev 2_eng.pdf

  3. https://www.sensefly.com/drones/example-datasets.html

  4. https://ivul.kaust.edu.sa/Pages/Dataset-UAV123.aspx

  5. https://www.dji.com/phantom-4/info

Author information

Corresponding author

Correspondence to Nikos Nikolaidis.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Kakaletsis, E. et al. (2019). Semantic Map Annotation Through UAV Video Analysis Using Deep Learning Models in ROS. In: Kompatsiaris, I., Huet, B., Mezaris, V., Gurrin, C., Cheng, WH., Vrochidis, S. (eds) MultiMedia Modeling. MMM 2019. Lecture Notes in Computer Science, vol 11296. Springer, Cham. https://doi.org/10.1007/978-3-030-05716-9_27

  • DOI: https://doi.org/10.1007/978-3-030-05716-9_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-05715-2

  • Online ISBN: 978-3-030-05716-9

  • eBook Packages: Computer Science, Computer Science (R0)
