2021 Special Issue on AI and Brain Science: Brain-inspired AI

Active sensing with artificial neural networks
Introduction
Decision making may be seen as a tradeoff between exploitation (maximizing future reward based on past experience) and exploration (gathering more information about the environment) (Sutton & Barto, 2018). Here we focus on active sensing as a particular form of exploration (Yang, Wolpert, & Lengyel, 2016). Suppose that the environment emits observations $o$ with probability $p^*(o)$. If an agent intends to plan and evaluate the informativeness of its actions, it needs to form a probabilistic model of the environment, $q(o)$. A canonical functional for measuring, and consequently reducing, the mismatch of beliefs, in this case between the agent's model and the real world, is the Kullback–Leibler divergence (Cover & Thomas, 2005): $D_{\mathrm{KL}}\big(p^*(o)\,\|\,q(o)\big) = \mathbb{E}_{p^*(o)}[\log p^*(o)] - \mathbb{E}_{p^*(o)}[\log q(o)]$. The first term does not depend on the agent's model and can thus be treated as a constant, leaving only the negative log-likelihood (NLL) to minimize. Practically, the model may represent a certain structure or, more often, hyperparameters that define a particular model within a family with a chosen structure. Furthermore, a given model often contains latent (unknown) variables that can broadly be classified into those that change on shorter timescales (e.g. an observation's hidden causes, denoted by a random variable $z$) and those that change on longer timescales (e.g. model parameters $\theta$): $q(o) = \int q(o \mid z, \theta)\, q(z)\, q(\theta)\, dz\, d\theta$. Therefore, the problem of directed exploration can be formulated as gaining information about the latent variables of the model (Fig. 1-a). While active learning (MacKay, 1992; Settles, 2012) is focused on resolving uncertainty about the parameters $\theta$, which reflect the statistical structure of the environment, we focus on active sensing: gaining information about the hidden causes $z$, latent variables that change on a trial-by-trial timescale (Yang, Wolpert et al., 2016). Importantly, the latent variable $z$ represents both the global context (e.g. the layout of a maze or the location of a reward), which the agent wants to figure out, and the local context (e.g. the compressed current observation or position within the maze), the prior belief over which depends on the action (Huys, Guitart-Masip, Dolan, & Dayan, 2015) (Fig. 1-b). Active sensing is an important problem both in pattern recognition (i.e. deciding which features to collect; Yu, Krishnapuram, Rosales, & Rao, 2009) and in neuroscience, as the pattern of human eye movements during visual exploration has been shown to optimize the resolution of uncertainty about the underlying context (Friston et al., 2012; Hoppe & Rothkopf, 2019; Yang, Lengyel et al., 2016; Yarbus, 1967).
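The decomposition into a model-independent entropy term plus the expected NLL can be checked numerically; a minimal sketch over a hypothetical four-outcome environment (the distributions are illustrative, not from the paper):

```python
import numpy as np

# Hypothetical discrete environment distribution p* and agent model q over 4 outcomes.
p_star = np.array([0.50, 0.25, 0.15, 0.10])
q      = np.array([0.40, 0.30, 0.20, 0.10])

entropy   = -np.sum(p_star * np.log(p_star))  # E_{p*}[-log p*(o)], model-independent
cross_ent = -np.sum(p_star * np.log(q))       # expected NLL under the model
kl        = np.sum(p_star * np.log(p_star / q))

# D_KL(p* || q) = cross-entropy - entropy, so minimizing the NLL minimizes the KL.
assert np.isclose(kl, cross_ent - entropy)
```

Since the entropy term is fixed by the environment, any improvement in the expected NLL translates one-to-one into a reduction of the divergence.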
Previous work on active sensing has focused on tractable but limited scenarios, using kernel methods (Yu et al., 2009), Gaussian mixture models (Yang, Wolpert et al., 2016) or entirely discrete domains (Friston et al., 2015). Here, we set out to investigate the case in which both observations and learned latent representations are continuous. The popular linear Gaussian model is not fit for active sensing, since the amount of uncertainty reduction it yields is constant regardless of the observation (Bishop, 2006). In contrast, implementing active sensing with an arbitrary nonlinear relationship can be difficult, in part because of statistical limitations of information gain estimation. In particular, it has been shown that in the frequent scenario of an intractable likelihood, unbiased estimates of mutual information computed from $N$ samples cannot be larger than $\log N$ (McAllester & Stratos, 2020; Poole et al., 2019).
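To appreciate the scale of this limitation, the $\log N$ ceiling can be tabulated for realistic sample counts; a minimal sketch (the batch sizes are illustrative):

```python
import numpy as np

# Sample-size ceiling on distribution-free mutual information estimates:
# from N samples, such estimates cannot exceed log N nats
# (McAllester & Stratos, 2020). Even huge batches give a modest ceiling.
for n in [1_000, 1_000_000, 1_000_000_000]:
    print(f"N = {n:>13,d}  ->  ceiling log N = {np.log(n):5.2f} nats")
```

Even a billion samples certify fewer than 21 nats, far below the mutual information carried by, e.g., high-dimensional images, which motivates a sampling-free measure.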
The main idea of this paper relies on the insight that neural networks with rectifying activations implement piecewise linear functions over the input space (Hanin & Rolnick, 2019; Park et al., 2019). Thus, we can both learn flexible representations (LeCun, Bengio, & Hinton, 2015) and compute a sensible (sampling-free) measure of information gain. First, we illustrate that the structure of the relation between the hidden causes $z$ and the observations $o$, i.e. the likelihood $p(o \mid z)$, has a key role in the potential information gain. Then, we describe how the Laplace approximation can be effectively used to quantify information gain in piecewise linear networks, complete the model by specifying its dynamics, and apply the approach to an active sensing (saccade simulation) task based on the MNIST dataset.
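The piecewise linearity exploited here is easy to verify on a toy network; a sketch with hypothetical random weights (not the paper's architecture): within the linear region selected by a ReLU activation pattern, the network coincides exactly with one affine map.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-layer ReLU network: f(x) = W2 @ relu(W1 @ x + b1) + b2
W1, b1 = rng.normal(size=(16, 4)), rng.normal(size=16)
W2, b2 = rng.normal(size=(1, 16)), rng.normal(size=1)

def f(x):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

x = rng.normal(size=4)

# The activation pattern at x selects a linear region; inside it the net is affine:
mask = (W1 @ x + b1 > 0).astype(float)
A = W2 @ (W1 * mask[:, None])   # local Jacobian, constant throughout the region
c = W2 @ (b1 * mask) + b2       # local offset

# The affine map reproduces the network exactly at x and at nearby points
# that keep the same activation pattern:
assert np.allclose(f(x), A @ x + c)
```

Because the map is exactly affine within each region, Gaussian beliefs can be propagated through it in closed form, which is what makes a sampling-free information gain computation feasible.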
Materials and methods
Suppose that the agent's beliefs about the next observation $o$, given a hypothetical action $a$, have the following structure: $p(o \mid a) = \int p(o \mid z, a)\, p(z \mid a)\, dz$ (we omit the time index and non-essential variables in the conditioning sets for clarity). In active sensing, information gain is quantified as the mutual information $I(o; z \mid a)$ (leaving the conditioning on model parameters implicit). A detailed intuition on using $I(o; z \mid a)$ in the context of active sensing, as well as a comparison with
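To build intuition for this quantity, and for why the linear Gaussian model mentioned in the introduction cannot drive active sensing, the information gain can be computed in closed form via standard Gaussian conditioning; a sketch with hypothetical dimensions (not the paper's network model):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical linear Gaussian model: o = H z + eps,  z ~ N(0, S),  eps ~ N(0, s2 I).
d_z, d_o, s2 = 3, 5, 0.5
H = rng.normal(size=(d_o, d_z))
S = np.eye(d_z)

# Posterior covariance of z given o (standard Gaussian conditioning):
S_post = np.linalg.inv(np.linalg.inv(S) + H.T @ H / s2)

# Information gain I(o; z) = 0.5 * (log det S - log det S_post).
info_gain = 0.5 * (np.linalg.slogdet(S)[1] - np.linalg.slogdet(S_post)[1])

# Note that S_post does not depend on the observed value of o: every observation
# yields the same uncertainty reduction, so a linear Gaussian model offers no
# basis for preferring one sensing action over another.
print(f"constant information gain: {info_gain:.3f} nats")
```

In a nonlinear (e.g. piecewise linear) model, by contrast, the effective local map depends on where the observation lands, so the expected gain varies across actions.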
Unbiased information gain estimates
We first validated our approach by comparing it with other popular information gain measures on a benchmark proposed by Belghazi et al. (2018) and Poole et al. (2019): two 20-dimensional random variables $x$ and $y$, correlated across corresponding dimensions with correlation coefficient $\rho$. Additionally, we implemented a nonlinear problem by following the same setup but applying a cubic transformation at the end (Song & Ermon, 2020). In both cases, the true mutual information is known: $I(x; y) = -\tfrac{d}{2} \log(1 - \rho^2)$ with $d = 20$.
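The benchmark construction can be sketched as follows ($\rho$ and the sample count are illustrative; the correlated-Gaussian setup and its closed-form mutual information follow Belghazi et al., 2018 and Poole et al., 2019):

```python
import numpy as np

rng = np.random.default_rng(2)

# Benchmark: x, y are d-dimensional standard Gaussians, each pair (x_i, y_i)
# correlated with coefficient rho.
d, rho, n = 20, 0.8, 100_000
x = rng.normal(size=(n, d))
y = rho * x + np.sqrt(1.0 - rho ** 2) * rng.normal(size=(n, d))

# Ground-truth mutual information: I(x; y) = -(d / 2) * log(1 - rho^2).
true_mi = -0.5 * d * np.log(1.0 - rho ** 2)
print(f"true MI = {true_mi:.2f} nats")

# Nonlinear variant (after Song & Ermon, 2020): an invertible elementwise map
# such as cubing leaves the mutual information unchanged.
y_cubed = y ** 3
```

Because the invertible elementwise transformation preserves mutual information, the nonlinear variant tests estimators under distorted marginals while keeping the same ground truth.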
Discussion
Active sensing has recently been studied within the 'planning as inference' framework (Botvinick & Toussaint, 2012; Levine, 2018) in the context of discrete domains (Friston et al., 2015; Schwartenbeck et al., 2019). In contrast, we focused on continuous states and observations, leveraging recent advances in probabilistic dynamical models (Chung et al., 2015; Gemici et al., 2017) that have been successfully used for building model-based reinforcement learning agents. For example, Hafner et al.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Funding sources
This research was funded by Fonds de la Recherche Scientifique (FNRS–FDP) Belgium, IDEX Bordeaux, France and ANR JCJC, France (ANR-18-CE37-0009-01). The funders had no involvement in study design; collection, analysis and interpretation of data; writing of the report; and in the decision to submit the article for publication.
References (54)
- Alemi, A. A., et al. (2017). Deep variational information bottleneck. In 5th International Conference on Learning Representations, ICLR 2017, Toulon, France.
- Belghazi, M. I., Baratin, A., Rajeswar, S., Ozair, S., Bengio, Y., Courville, A., et al. (2018). Mutual information neural estimation. In Proceedings of the 35th International Conference on Machine Learning, ICML 2018.
- Bishop, C. M. (2006). Pattern recognition and machine learning (information science and statistics). Springer.
- Bishop, C. M., & Qazaz, C. S. (1997). Regression with input-dependent noise: A Bayesian treatment. Advances in Neural Information Processing Systems.
- Botvinick, M., & Toussaint, M. (2012). Planning as inference. Trends in Cognitive Sciences.
- Buesing, L., et al. (2018). Learning and querying fast generative models for reinforcement learning. arXiv preprint.
- Burda, Y., et al. (2019). Large-scale study of curiosity-driven learning. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA.
- Burda, Y., et al. (2019). Exploration by random network distillation. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA.
- Cho, K., et al. (2014). Learning phrase representations using RNN encoder-decoder for statistical machine translation. In Proceedings of EMNLP 2014.
- Chung, J., et al. (2015). A recurrent latent variable model for sequential data. Advances in Neural Information Processing Systems.
- Cover, T. M., & Thomas, J. A. (2005). Elements of information theory. Wiley.
- de Boer, P.-T., et al. (2005). A tutorial on the cross-entropy method. Annals of Operations Research.
- Friston, K., et al. (2015). Active inference and epistemic value. Cognitive Neuroscience.
- Friston, K., Thornton, C., & Clark, A. (2012). Free-energy minimization and the dark-room problem. Frontiers in Psychology.
- Gemici, M., et al. (2017). Generative temporal models with memory. CoRR.
- Glorot, X., Bordes, A., & Bengio, Y. (2011). Deep sparse rectifier neural networks. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics.
- Guez, A., et al. (2018). Learning to search with MCTSnets. In Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholm, Sweden.
- Hafner, D., et al. (2019). Learning latent dynamics for planning from pixels. In Proceedings of the 36th International Conference on Machine Learning.
- Hanin, B., & Rolnick, D. (2019). Complexity of linear regions in deep networks. In Proceedings of the 36th International Conference on Machine Learning.
- Henaff, M., Whitney, W. F., & LeCun, Y. (2017). Model-based planning with discrete and continuous actions. arXiv preprint.
- Hoppe, D., & Rothkopf, C. A. (2019). Multi-step planning of eye movements in visual search. Scientific Reports.
- Houthooft, R., et al. (2016). VIME: Variational information maximizing exploration. Advances in Neural Information Processing Systems.
- Huys, Q. J. M., Guitart-Masip, M., Dolan, R. J., & Dayan, P. (2015). Decision-theoretic psychiatry. Clinical Psychological Science.
- Kingma, D. P., & Ba, J. (2015). Adam: A method for stochastic optimization. In 3rd International Conference on Learning Representations, ICLR 2015.