Abstract
Drones, also known as unmanned aerial vehicles, can be used to aid various aerial cinematography tasks. However, using drones for aerial cinematography requires the coordination of several people, increasing the cost and reducing the shooting flexibility, while also increasing the cognitive load of the drone operators. To overcome these limitations, we propose a deep reinforcement learning (RL) method for continuous fine-grained drone control that allows acquiring high-quality frontal view person shots. To this end, a head pose image dataset is combined with 3D models and face alignment/warping techniques to develop an RL environment that realistically simulates the effects of the drone control commands. An appropriate reward-shaping approach is also proposed to improve the stability of the employed continuous RL method. Apart from performing continuous control, we demonstrate that the proposed method can also be effectively combined with simulation environments that support only discrete control commands, improving the control accuracy even in this case. The effectiveness of the proposed technique is experimentally demonstrated using several quantitative and qualitative experiments.
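The reward-shaping idea mentioned above can be illustrated with a minimal potential-based shaping sketch. The potential function, state representation, frame size, and discount factor below are assumptions for illustration only; the paper's actual reward design is not reproduced here.

```python
import math

GAMMA = 0.99                 # discount factor (assumed)
FRAME_W, FRAME_H = 640, 480  # frame resolution (assumed)

def potential(state):
    """Potential Phi(s): negative distance of the detected face
    centre from the frame centre (a hypothetical choice)."""
    fx, fy = state  # (x, y) face-centre pixel coordinates
    cx, cy = FRAME_W / 2, FRAME_H / 2
    return -math.hypot(fx - cx, fy - cy)

def shaped_reward(reward, state, next_state):
    """r' = r + gamma * Phi(s') - Phi(s).

    Potential-based shaping of this form is known to leave the
    optimal policy unchanged while densifying the reward signal,
    which can stabilise continuous RL training."""
    return reward + GAMMA * potential(next_state) - potential(state)

# A control command that moves the face towards the frame centre
# yields a positive shaping bonus, even if the task reward is zero.
bonus = shaped_reward(0.0, (100, 100), (300, 220))
```

Any monotone measure of shot quality (e.g. head pose deviation from frontal) could serve as the potential; the face-centring distance above is only one plausible choice.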
Acknowledgements
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under Grant Agreement No. 731667 (MULTIDRONE). This publication reflects the authors’ views only. The European Commission is not responsible for any use that may be made of the information it contains.
Ethics declarations
Conflict of interest
The authors declare that they have no conflicts of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Cite this article
Passalis, N., Tefas, A. Continuous drone control using deep reinforcement learning for frontal view person shooting. Neural Comput & Applic 32, 4227–4238 (2020). https://doi.org/10.1007/s00521-019-04330-6