Elsevier

Automatica

Volume 114, April 2020, 108793
Experimental analysis of a game-theoretic formulation of target tracking

https://doi.org/10.1016/j.automatica.2019.108793

Abstract

Optimal trajectories for two platforms with similar dynamics are calculated using a game-theoretic formulation. Each platform makes noisy observations of the kinematic state of the other. The objective of each is to maximise observable information about the other while minimising the information the other is able to acquire about it. That is to say, each platform maximises the mutual information between the expected future measurement of the opposing platform and the current likelihood of the state, whilst minimising the estimated mutual information between potential measurements of itself by the other and its actual state. The multi-objective optimisation problem for each platform is converted to a single optimisation using a Pareto parameter to weigh the relative importance of the two information measures. The relationship between the two Pareto parameters and different track initialisations is investigated. Remarkably, this complex coupled system of two platforms exhibits, for suitably chosen values of the Pareto parameters, interesting cyclical behaviours that are worthy of further exploration.

Introduction

In the target tracking research literature, it is often assumed that the sensor state is independent of the state of interest (Bar-Shalom et al., 2001, Smith and Singh, 2006). In practice, however, tracking performance depends on the sensor state, and this dependence cannot be ignored. Optimising the sensor state to improve tracking performance is an important research topic that has been studied in the literature under various names, such as sensor path planning, mobile sensor coordination and sensor scheduling (Chhetri et al., 2007, Oshman and Davidson, 1999, Passerieux and Van Cappel, 1998, Vitus et al., 2012, Yang et al., 2014). For instance, in bearings-only target tracking with a single platform, the platform must manoeuvre relative to the target in order to observe the target state (Cheng et al., 2013, Doğançay, 2005, Nardone and Aidala, 1981). In cooperative tracking, deploying sensors on multiple mobile robots or unmanned aerial vehicles (UAVs) can yield an efficient, accurate and reliable target tracking strategy, even with limited sensing capability (Lin et al., 2009, Martínez and Bullo, 2006, Xu and Doğançay, 2015). Moreover, the motion of sensor platforms may be driven by an optimal control policy incorporating both state estimation feedback and measurement uncertainties to approach arbitrarily closely an optimal objective function (Charrow, Michael, and Kumar, 2014, Hoffmann and Tomlin, 2010). In all of these cases, the state of the sensors has a significant impact on tracking accuracy.

In sensor path planning, two metrics of information related to the underlying target state are widely used: the Fisher information matrix (or the posterior Cramér–Rao lower bound) and mutual information, both of which are calculated approximately in different situations. In bearings-only target tracking, the optimal path of the passive sensor is determined in Helferty and Mudgett (1993) by minimising the trace of the Cramér–Rao lower bound matrix (i.e., the inverse of the Fisher information matrix) of the current or future source position and velocity errors, on the premise that the corresponding posterior probability density is Gaussian. Moreover, the optimisation of observer trajectories for bearings-only fixed-target localisation is presented in Oshman and Davidson (1999), where the corresponding numerical scheme is implemented under the criterion of maximising the determinant of the Fisher information matrix. A framework for the systematic management of multiple sensors for target tracking in clutter is described in Hernandez, Kirubarajan, and Bar-Shalom (2004); it approximates the posterior Cramér–Rao lower bound in order to achieve accurate target state estimation. The approach is applied to tracking a submarine by deploying a series of constant false-alarm rate passive sonobuoys. This framework is further extended in Punithakumar, Kirubarajan, and Hernandez (2006), which addresses imperfect sensor placement and uncertain sensor movement by minimising the posterior Cramér–Rao lower bound.
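As a rough illustration of these Fisher-information criteria, the following minimal sketch computes the Fisher information matrix for a fixed target observed through bearings-only measurements and compares two observer paths. It is not the method of any of the cited papers: the function name `bearing_fim`, the geometry, and the 1° noise level are assumptions made purely for illustration, and the measurement noise is assumed to be independent, zero-mean Gaussian.

```python
import numpy as np

def bearing_fim(observer_positions, target, sigma):
    """Fisher information matrix of a fixed 2-D target position from
    bearings-only measurements with i.i.d. Gaussian noise (std sigma, rad).
    Hypothetical helper, for illustration only."""
    fim = np.zeros((2, 2))
    for p in observer_positions:
        dx, dy = target - p
        r2 = dx**2 + dy**2
        # Gradient of the bearing atan2(dy, dx) w.r.t. the target position.
        h = np.array([-dy, dx]) / r2
        fim += np.outer(h, h) / sigma**2
    return fim

target = np.array([0.0, 1000.0])   # assumed target position (m)
sigma = np.deg2rad(1.0)            # assumed bearing noise

# Path 1: observer heads straight at the target, so the bearing never changes.
radial = np.array([[0.0, 100.0 * k] for k in range(5)])
# Path 2: observer moves across the line of sight.
crossing = np.array([[200.0 * k, 0.0] for k in range(5)])

fim_radial = bearing_fim(radial, target, sigma)
fim_crossing = bearing_fim(crossing, target, sigma)

# Determinant criterion: larger is better; a singular matrix means
# the target position is unobservable along that path.
print("det FIM, radial path:  ", np.linalg.det(fim_radial))
print("det FIM, crossing path:", np.linalg.det(fim_crossing))
# Trace criterion on the Cramér-Rao lower bound: smaller is better.
print("trace CRLB, crossing path:", np.trace(np.linalg.inv(fim_crossing)))
```

The radial path gives a rank-deficient information matrix because every measurement carries the same bearing, which is why the observer has to manoeuvre relative to the target in bearings-only tracking.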

On the other hand, mutual information is used as an alternative criterion for mobile sensor path planning (Charrow, Kumar, and Michael, 2014, Charrow, Michael, and Kumar, 2014, Hoffmann and Tomlin, 2010). A distributed control architecture to facilitate search by a mobile sensor network is developed in Hoffmann and Tomlin (2010). It maximises the mutual information between the sensors and the target state of interest, using a particle filter representation of the posterior probability distribution, and controls the mobile sensors such that future observations minimise the expected future uncertainty of the target state. Moreover, an active control strategy that allows a team of robots equipped with range sensors to localise an unknown, stationary target in non-convex indoor environments is developed in Charrow, Michael, and Kumar (2014); it maximises the mutual information between the robots’ measurements and their current belief of the target position, with particle filtering used for estimation. Furthermore, a control policy that maximises mutual information over a finite time horizon, based on Gaussian mixture models approximating the posterior probability distributions, is presented in Charrow, Kumar, and Michael (2014) to enable a team of robots to estimate the location of a mobile target using range-only sensors.
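To make the mutual information criterion concrete, the sketch below evaluates it in the simplest closed-form case, a linear measurement with Gaussian prior and noise, and picks the sensing option with the larger value. This is a deliberate simplification and not the particle-filter or Gaussian-mixture computation used in the cited works; the function name, the covariances and the "far"/"near" options are all assumed for illustration.

```python
import numpy as np

def gaussian_mutual_information(P, H, R):
    """I(x; z) in nats for z = H x + v with x ~ N(m, P) and v ~ N(0, R).
    Closed form: 0.5 * (log det(H P H^T + R) - log det(R))."""
    S = H @ P @ H.T + R  # covariance of the predicted measurement
    _, logdet_S = np.linalg.slogdet(S)
    _, logdet_R = np.linalg.slogdet(R)
    return 0.5 * (logdet_S - logdet_R)

P = np.diag([100.0, 25.0])   # assumed prior covariance of the target position
H = np.eye(2)                # assumption: position is measured directly

# Two hypothetical sensor placements, modelled only through their noise level.
candidates = {
    "far": 400.0 * np.eye(2),
    "near": 25.0 * np.eye(2),
}

scores = {name: gaussian_mutual_information(P, H, R)
          for name, R in candidates.items()}
best = max(scores, key=scores.get)

for name, value in scores.items():
    print(f"{name}: I(x; z) = {value:.3f} nats")
print("selected placement:", best)
```

In this Gaussian setting, maximising the mutual information is equivalent to minimising the determinant of the posterior covariance, which is one way of seeing why it serves as a measure of expected uncertainty reduction.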

These sensor trajectory optimisations only consider unilateral path planning; that is, they use the estimated state of the observed platform to guide the motion of the sensor platform. It is likely that, in a real situation, both the sensor and the target will act in an intelligent way. Both will detect and track the state of the other and will adjust their own motions, using the estimated state of the other, in order to achieve their intentions. For example, in a situation of two UAVs in pursuit, the pursuer UAV might detect the other from the return of a signal transmitted by itself (as an active radar) and steer its own motion close to the other to minimise track error, while also attempting not to be detected by the other. In the meantime, the other UAV may estimate the state of the pursuer with a passive sensing capability and attempt to discern the intention of the pursuer as accurately as possible, while again minimising the information it allows the pursuer to gain. The corresponding sensor trajectory scheduling problem should be explored in this light for systems with location-dependent sensors.

In this paper, we extend this idea to the target tracking problem and formulate a chasing and escaping scenario for two moving platforms which have similar sensing and kinematic capabilities. Each platform maintains a model not only of its opponent’s state but also of its opponent’s estimate of its own state. Two cost functions are derived for each of the two platforms. One measures the information gained about the opponent, and the other is an estimate of the information that the opponent has gained about the platform itself. The tracking performance of each is examined using Pareto optimisation, where we construct a single cost function as a linear combination of the original costs by introducing a constant parameter $\lambda \in [0,1]$. Each platform has its own Pareto parameter that balances its curiosity (the requirement to obtain better information about the other) and its paranoia (the unwillingness to allow the other to obtain information about it).
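The Pareto weighting just described can be sketched as follows. The exact form of the combined cost is not given in this excerpt, so the sign convention, the choice of which term λ multiplies, and the numerical information values are all assumptions; only the manoeuvre labels LT, RT and CV are taken from the paper.

```python
MANOEUVRES = ("LT", "RT", "CV")  # turn left, turn right, constant velocity

def scalarised_objective(info_gained, info_leaked, lam):
    """Single objective from two competing ones (assumed form):
    lam weights curiosity, (1 - lam) weights paranoia."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * info_gained - (1.0 - lam) * info_leaked

def best_manoeuvre(gain, leak, lam):
    """Manoeuvre that maximises the scalarised objective."""
    return max(MANOEUVRES,
               key=lambda u: scalarised_objective(gain[u], leak[u], lam))

# Hypothetical mutual information values (nats) for each manoeuvre.
gain = {"LT": 0.9, "RT": 0.4, "CV": 0.6}   # information gained about the opponent
leak = {"LT": 0.8, "RT": 0.1, "CV": 0.5}   # information the opponent gains

for lam in (0.0, 0.5, 1.0):
    print(f"lambda = {lam}: choose {best_manoeuvre(gain, leak, lam)}")
```

With these made-up numbers, a purely curious platform (λ = 1) picks the manoeuvre with the largest gain, while a purely paranoid one (λ = 0) picks the manoeuvre that leaks the least; intermediate values trade the two off.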

The rest of this paper is organised as follows. A detailed formulation of the chase-escape scenario is presented in Section 2. Section 3 gives the model, optimisation and numerical implementation based on mutual information. A simulation analysis is shown in Section 4, where typical trajectory classes of the two platforms and the relationship between the possible trajectory patterns and the Pareto weighting parameters under various track initialisations are discussed. Finally, a conclusion is given in Section 5.

Notations

Throughout this paper, $\mathbb{R}^n$ is the space of n-dimensional real vectors. The symbol “$\triangleq$” means definition and “$\otimes$” refers to the Kronecker product. “$|\cdot|$” represents the cardinality of the corresponding set.

Chase-escape scenario

Consider a scenario of chasing and escaping involving platforms A and B. Each platform attempts to estimate the state of the other over time and reduce the estimation uncertainty by adjusting its own kinematic state, while it also attempts to maximise the estimation uncertainty of its own state by the other platform.

Let $x_k \in \mathbb{R}^{n_x}$ and $s_k \in \mathbb{R}^{n_s}$ represent the state vectors over the time index $k$ of platforms A and B, respectively. The measurements of these states are $z_k \in \mathbb{R}^{n_z}$ for those of A by B, and $y_k \in \mathbb{R}^{n\ldots}$

Trajectory control under mutual information based motion-driven models

The systems (1)–(4) are assumed to be known to both platforms. Our goal is to design appropriate cost functions to influence the motion control of the two platforms in the chasing-escaping scenario, such that the platform motion scheduling requirements for both sides are equally and optimally satisfied.

Each platform attempts to adjust its motion state, based on its measurements of the other and its estimate of the other platform’s measurements of itself, such that at the next epoch it can

Simulation analysis

In this section, the two-platform chasing-escaping scenario is described. Assume that the platforms A and B are moving in the $(\zeta,\eta)$ plane and both have identical manoeuvring capabilities with a set of three manoeuvres: turning left ($\vartheta_k^1, \mu_k^1 = \mathrm{LT}$), turning right ($\vartheta_k^2, \mu_k^2 = \mathrm{RT}$) and constant velocity ($\vartheta_k^3, \mu_k^3 = \mathrm{CV}$), each parameterised by a turn angle $\psi$ and a time step $T$. The dynamical model for platform A is given by $x_{k+1} = F_{\vartheta_k} x_k + G_{\vartheta_k} w_k$, and replacing $x_k$, $\vartheta_k$ and $w_k$ by $s_k$, $\mu_k$ and $\varpi_k$ respectively gives the
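The three-manoeuvre motion model described in this section can be sketched with a standard coordinated-turn discretisation. The matrices actually used in the paper are not shown in this excerpt, so everything below is an assumption: the state ordering [ζ, ζ̇, η, η̇], the turn rate ψ/T, and a manoeuvre-independent noise gain built with a Kronecker product (the paper's gain may depend on the manoeuvre).

```python
import numpy as np

def transition_matrix(manoeuvre, psi, T):
    """Assumed transition matrix for state [zeta, zeta_dot, eta, eta_dot].
    'LT'/'RT' rotate the velocity by +psi/-psi over one step of length T
    (psi must be non-zero); 'CV' is constant velocity."""
    if manoeuvre == "CV":
        return np.array([[1.0, T, 0.0, 0.0],
                         [0.0, 1.0, 0.0, 0.0],
                         [0.0, 0.0, 1.0, T],
                         [0.0, 0.0, 0.0, 1.0]])
    sign = {"LT": 1.0, "RT": -1.0}[manoeuvre]
    w = sign * psi / T                      # signed turn rate (rad/s)
    s, c = np.sin(w * T), np.cos(w * T)
    return np.array([[1.0, s / w, 0.0, -(1.0 - c) / w],
                     [0.0, c, 0.0, -s],
                     [0.0, (1.0 - c) / w, 1.0, s / w],
                     [0.0, s, 0.0, c]])

def noise_gain(T):
    """Assumed gain mapping 2-D acceleration noise into the state."""
    return np.kron(np.eye(2), np.array([[T**2 / 2.0], [T]]))

T, psi = 1.0, np.pi / 2                     # illustrative values only
x0 = np.array([0.0, 10.0, 0.0, 0.0])        # moving along +zeta at 10 m/s

# Four noise-free left turns of 90 degrees trace a full circle.
x = x0.copy()
for _ in range(4):
    x = transition_matrix("LT", psi, T) @ x
print("state after four left turns:", np.round(x, 6))

# One noisy constant-velocity step: x_{k+1} = F x_k + G w_k.
rng = np.random.default_rng(0)
w_k = rng.normal(scale=0.1, size=2)
x_next = transition_matrix("CV", psi, T) @ x0 + noise_gain(T) @ w_k
print("noisy CV step:", x_next)
```

A turn in this model changes only the heading, not the speed, so the manoeuvre choice steers the platform without altering how fast it moves.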

Conclusion

In this paper, trajectory planning for two platforms was investigated, based on both maximising the information gained from sensors (curiosity) and minimising the information gained by the opposing platform (paranoia). This multi-objective optimisation was solved by the introduction of Pareto parameters balancing the two objectives. The weighting between these two for each platform generated a constrained range of behaviours, mediated by the difference in the sensing paradigms: one platform

References (25)

  • Cover, T. M., et al. (2006). Elements of information theory.
  • Helferty, J. P., & Mudgett, D. R. (1993). Optimal observer trajectories for bearings-only tracking by minimizing the...
    Yanbo Yang obtained the B.E., M.E. and Ph.D. degrees from the School of Automation at Northwestern Polytechnical University (NPU) in 2011, 2014 and 2016, respectively. He studied in the Honours College of NPU from 2007 to 2010. He was a visiting Ph.D. student at the University of Melbourne for two years, from December 2014 to November 2016.

    He has been a lecturer with the School of Mechano-Electronic Engineering, Xidian University, since 2017. His main research interests include target tracking, estimation theory, and information fusion.

    Bill Moran (M’95) has served since 2017 as Professor of Defence Technology at the University of Melbourne. From 2014 to 2017, he was Director of the Signal Processing and Sensor Control Group in the School of Engineering at RMIT University. From 2001 to 2014, he was a Professor in the Department of Electrical Engineering, University of Melbourne, where he was also Director of the Defence Science Institute (2011–14). Earlier, he was Professor of Mathematics (1976–1991), Head of the Department of Pure Mathematics (1977–79, 1984–86) and Dean of Mathematical and Computer Sciences (1981, 1982, 1989) at the University of Adelaide, and Head of the Mathematics Discipline at the Flinders University of South Australia (1991–95). He was Head of the Medical Signal Processing Program (1995–99) in the Cooperative Research Centre for Sensor Signal and Information Processing. He was a member of the Australian Research Council College of Experts from 2007 to 2009. He was elected to the Fellowship of the Australian Academy of Science in 1984. He holds a Ph.D. in Pure Mathematics from the University of Sheffield, UK (1968), and a First Class Honours B.Sc. in Mathematics from the University of Birmingham (1965). He has been a Principal Investigator on numerous research grants and contracts, in areas spanning pure mathematics to radar development, from both Australian and US research funding agencies, including DARPA, AFOSR, AFRL, Australian Research Council (ARC), Australian Department of Education, Science and Training, and Defence Science and Technology, Australia. His main areas of research interest are signal processing, both theoretically and in applications to radar, waveform design and radar theory, sensor networks, and sensor management. He also works in various areas of mathematics including harmonic analysis, representation theory, and number theory.

    Xuezhi Wang received the B.S. degree in avionics from Northwestern Polytechnical University, Xi’an, China, in 1982, and the Ph.D. degree in signal and systems from the Department of Electrical and Electronic Engineering, University of Melbourne, Australia, in 2001. He has been involved in radar and sonar signal processing research and development since 1983 and was with the University of Melbourne from 1997 to 2015. He is currently a Senior Researcher with the Royal Melbourne Institute of Technology (RMIT) University, Australia. He has more than 130 publications, including book chapters, refereed journal and conference papers, and technical reports. His research interests are in the areas of stochastic signal processing, information theory, Bayesian estimation, data fusion, and situation assessment.

    Timothy C. Brown gained his B.Sc. (Hons) from Monash University in 1974 and Ph.D. from Cambridge University in 1979, both with central interests in probability and statistics. He is Director of the Australian Mathematical Sciences Institute, a joint venture of 12 universities in Australia that is hosted at the University of Melbourne. His previous posts have been Professor of Statistical Data Science at University of Melbourne (2016–2018), Deputy Vice-Chancellor (Research) at La Trobe University (2008–2013), Dean of Science at Australian National University (2002–2007), Professor of Statistics at University of Melbourne (1992–2002), Professor of Probability and Statistics (1987–1992) at University of Western Australia, Director of Statistical Consulting Centre (1984–1987) at University of Melbourne and lecturing positions at Bath University in the UK (1978–1981) and Monash University (1981–1984). His research interests are broadly in probability and statistics and their applications, with a special interest in probability approximations.

    Simon Williams received the bachelor’s degree in mathematical physics and pure mathematics from The University of Adelaide. He went to Oxford on a Kobe Steel Scholarship to do graduate study with Roger Penrose on general relativity and conformal field theory.

    Since returning to Australia, he has taught mathematics from high school to university honours level, worked as a Radar Signal Processor with DST Group, and built mathematical models of language and battles at CSIRO before joining Flinders University to work on computer-aided screening mammography. He is currently a Senior Research Fellow with the Department of Electrical and Electronic Engineering, University of Melbourne. His research interests include modelling of algal ponds, X-ray images, optimal trajectories for underwater vehicles, and the fundamentals of Fisher information metrics for sensors.

    Quan Pan received the B.Sc. degree from Huazhong Institute of Technology in 1991, and the M.Sc. and Ph.D. degrees from Northwestern Polytechnical University in 1991 and 1997, respectively. He is a professor in the School of Automation and has been the Director of the Key Laboratory of Information Fusion Technology, Ministry of Education, since 2013. He was Dean of the School of Automation of Northwestern Polytechnical University from 2009 to 2018. His research interests include information fusion, hybrid system estimation theory, multi-scale estimation theory, target tracking and image processing. He is a Board Member of the Chinese Association of Automation, and a member of the Chinese Association of Aeronautics and Astronautics. He received the 6th Chinese National Youth Award for Outstanding Contribution to Science and Technology in 1998 and was named a Chinese National New Century Excellent Professional Talent in 2000.

    This work is partially supported by the National Natural Science Foundation of China (No. 61703324), National Natural Science Foundation of Shaanxi Province (No. 2019JQ-215), and the Fundamental Research Funds for the Central Universities (No. JB190403). It is also supported in part by the Asian Office of Aerospace Research & Development (AOARD)/AFRL under Grant No. 2386-15-1-4066. The material in this paper was not presented at any conference. This paper was recommended for publication in revised form by Associate Editor Brett Ninness under the direction of Editor Torsten Söderström.