Nicotinic receptors in the ventral tegmental area promote uncertainty-seeking

Abstract

Cholinergic neurotransmission affects decision-making, notably through the modulation of perceptual processing in the cortex. In addition, acetylcholine acts on value-based decisions through as yet unknown mechanisms. We found that nicotinic acetylcholine receptors (nAChRs) expressed in the ventral tegmental area (VTA) are involved in the translation of expected uncertainty into motivational value. We developed a multi-armed bandit task for mice with three locations, each associated with a different reward probability. We found that mice lacking the nAChR β2 subunit showed less uncertainty-seeking than their wild-type counterparts. Using model-based analysis, we found that reward uncertainty motivated wild-type mice, but not mice lacking the nAChR β2 subunit. Selective re-expression of the β2 subunit in the VTA was sufficient to restore spontaneous bursting activity in dopamine neurons and uncertainty-seeking. Our results reveal an unanticipated role for subcortical nAChRs in motivation induced by expected uncertainty and provide a parsimonious account for a wealth of behaviors related to nAChRs in the VTA expressing the β2 subunit.

Figure 1: Decisions under uncertainty in a mouse bandit task using intracranial self-stimulations.
Figure 2: Model-based analysis of decisions shows motivation for expected uncertainty.
Figure 3: β2*-nAChRs in the VTA affect choices and locomotion.
Figure 4: Model-based analysis reveals a role for VTA β2-nAChR in uncertainty-driven motivation.
Figure 5: β2*-nAChRs affect decision-making under uncertainty in a dynamical foraging task.
Figure 6: New interpretation of behaviors related to VTA nAChRs using the uncertainty model.

References

  1. Everitt, B.J. & Robbins, T.W. Central cholinergic systems and cognition. Annu. Rev. Psychol. 48, 649–684 (1997).

  2. Dani, J.A. & Bertrand, D. Nicotinic acetylcholine receptors and nicotinic cholinergic mechanisms of the central nervous system. Annu. Rev. Pharmacol. Toxicol. 47, 699–729 (2007).

  3. Guillem, K. et al. Nicotinic acetylcholine receptor β2 subunits in the medial prefrontal cortex control attention. Science 333, 888–891 (2011).

  4. Rangel, A., Camerer, C. & Montague, P.R. A framework for studying the neurobiology of value-based decision making. Nat. Rev. Neurosci. 9, 545–556 (2008).

  5. Fobbs, W.C. & Mizumori, S.J. Cost-benefit decision circuitry: proposed modulatory role for acetylcholine. Prog. Mol. Biol. Transl. Sci. 122, 233–261 (2014).

  6. Kolokotroni, K.Z., Rodgers, R.J. & Harrison, A.A. Acute nicotine increases both impulsive choice and behavioral disinhibition in rats. Psychopharmacology (Berl.) 217, 455–473 (2011).

  7. Mendez, I.A., Gilbert, R.J., Bizon, J.L. & Setlow, B. Effects of acute administration of nicotinic and muscarinic cholinergic agonists and antagonists on performance in different cost-benefit decision making tasks in rats. Psychopharmacology (Berl.) 224, 489–499 (2012).

  8. McGrath, D.S. & Barrett, S.P. The comorbidity of tobacco smoking and gambling: a review of the literature. Drug Alcohol Rev. 28, 676–681 (2009).

  9. Schultz, W. Multiple dopamine functions at different time courses. Annu. Rev. Neurosci. 30, 259–288 (2007).

  10. Waelti, P., Dickinson, A. & Schultz, W. Dopamine responses comply with basic assumptions of formal learning theory. Nature 412, 43–48 (2001).

  11. Montague, P.R., Dayan, P. & Sejnowski, T.J. A framework for mesencephalic dopamine systems based on predictive Hebbian learning. J. Neurosci. 16, 1936–1947 (1996).

  12. Berridge, K.C. From prediction error to incentive salience: mesolimbic computation of reward motivation. Eur. J. Neurosci. 35, 1124–1143 (2012).

  13. Maskos, U. et al. Nicotine reinforcement and cognition restored by targeted expression of nicotinic receptors. Nature 436, 103–107 (2005).

  14. Mameli-Engvall, M. et al. Hierarchical control of dopamine neuron-firing patterns by nicotinic receptors. Neuron 50, 911–921 (2006).

  15. Grace, A.A., Floresco, S.B., Goto, Y. & Lodge, D.J. Regulation of firing of dopaminergic neurons and control of goal-directed behaviors. Trends Neurosci. 30, 220–227 (2007).

  16. Daw, N.D., O'Doherty, J.P., Dayan, P., Seymour, B. & Dolan, R.J. Cortical substrates for exploratory decisions in humans. Nature 441, 876–879 (2006).

  17. Frank, M.J., Doll, B.B., Oas-Terpstra, J. & Moreno, F. Prefrontal and striatal dopaminergic genes predict individual differences in exploration and exploitation. Nat. Neurosci. 12, 1062–1068 (2009).

  18. Gittins, J.C. & Jones, D.M. A dynamic allocation index for the discounted multiarmed bandit problem. Biometrika 66, 561–565 (1979).

  19. Scott, P.D. & Markovitch, S. Learning novel domains through curiosity and conjecture. IJCAI (US) 1, 669–674 (1989).

  20. Kaelbling, L.P. Learning in Embedded Systems (MIT Press, 1993).

  21. Meuleau, N. & Bourgine, P. Exploration of multi-state environments: Local measures and back-propagation of uncertainty. Mach. Learn. 35, 117–154 (1999).

  22. Yu, A.J. & Dayan, P. Uncertainty, neuromodulation, and attention. Neuron 46, 681–692 (2005).

  23. Bach, D.R. & Dolan, R.J. Knowing how much you don't know: a neural organization of uncertainty estimates. Nat. Rev. Neurosci. 13, 572–586 (2012).

  24. Oudeyer, P.-Y. & Kaplan, F. What is intrinsic motivation? A typology of computational approaches. Front. Neurorobot. 1, 6 (2007).

  25. Fiorillo, C.D., Tobler, P.N. & Schultz, W. Discrete coding of reward probability and uncertainty by dopamine neurons. Science 299, 1898–1902 (2003).

  26. Schuck-Paim, C., Pompilio, L. & Kacelnik, A. State-dependent decisions cause apparent violations of rationality in animal choice. PLoS Biol. 2, e402 (2004).

  27. Carlezon, W.A. Jr. & Chartoff, E.H. Intracranial self-stimulation (ICSS) in rodents to study the neurobiology of motivation. Nat. Protoc. 2, 2987–2995 (2007).

  28. Kobayashi, T., Nishijo, H., Fukuda, M., Bureš, J. & Ono, T. Task-dependent representations in rat hippocampal place neurons. J. Neurophysiol. 78, 597–613 (1997).

  29. Funamizu, A., Ito, M., Doya, K., Kanzaki, R. & Takahashi, H. Uncertainty in action-value estimation affects both action choice and learning rate of the choice behaviors of rats. Eur. J. Neurosci. 35, 1180–1189 (2012).

  30. Anselme, P., Robinson, M.J.F. & Berridge, K.C. Reward uncertainty enhances incentive salience attribution as sign-tracking. Behav. Brain Res. 238, 53–61 (2013).

  31. Sutton, R.S. & Barto, A.G. Reinforcement Learning: an introduction (MIT Press, 1998).

  32. Kakade, S. & Dayan, P. Dopamine: generalization and bonuses. Neural Netw. 15, 549–559 (2002).

  33. Herrnstein, R.J. Relative and absolute strength of response as a function of frequency of reinforcement. J. Exp. Anal. Behav. 4, 267–272 (1961).

  34. Ishii, S., Yoshida, W. & Yoshimoto, J. Control of exploitation-exploration meta-parameter in reinforcement learning. Neural Netw. 15, 665–687 (2002).

  35. Yeomans, J. & Baptista, M. Both nicotinic and muscarinic receptors in ventral tegmental area contribute to brain-stimulation reward. Pharmacol. Biochem. Behav. 57, 915–921 (1997).

  36. Serreau, P., Chabout, J., Suarez, S.V., Naudé, J. & Granon, S. Beta2-containing neuronal nicotinic receptors as major actors in the flexible choice between conflicting motivations. Behav. Brain Res. 225, 151–159 (2011).

  37. Krugel, L.K., Biele, G., Mohr, P.N., Li, S.-C. & Heekeren, H.R. Genetic variation in dopaminergic neuromodulation influences the ability to rapidly and flexibly adapt decisions. Proc. Natl. Acad. Sci. USA 106, 17951–17956 (2009).

  38. Niv, Y., Edlund, J.A., Dayan, P. & O'Doherty, J.P. Neural prediction errors reveal a risk-sensitive reinforcement-learning process in the human brain. J. Neurosci. 32, 551–562 (2012).

  39. Balasubramani, P.P., Chakravarthy, V.S., Ravindran, B. & Moustafa, A.A. An extended reinforcement learning model of basal ganglia to understand the contributions of serotonin and dopamine in risk-based decision making, reward prediction, and punishment learning. Front. Comput. Neurosci. 8, 47 (2014).

  40. Granon, S., Faure, P. & Changeux, J.-P. Executive and social behaviors under nicotinic receptor regulation. Proc. Natl. Acad. Sci. USA 100, 9596–9601 (2003).

  41. Picciotto, M.R. et al. Abnormal avoidance learning in mice lacking functional high-affinity nicotine receptor in the brain. Nature 374, 65–67 (1995).

  42. Maubourguet, N., Lesne, A., Changeux, J.-P., Maskos, U. & Faure, P. Behavioral sequence analysis reveals a novel role for β2* nicotinic receptors in exploration. PLoS Comput. Biol. 4, e1000229 (2008).

  43. Gordon, G., Fonio, E. & Ahissar, E. Emergent exploration via novelty management. J. Neurosci. 34, 12646–12661 (2014).

  44. Payzan-LeNestour, E. & Bossaerts, P. Risk, unexpected uncertainty and estimation uncertainty: Bayesian learning in unstable settings. PLoS Comput. Biol. 7, e1001048 (2011).

  45. Redgrave, P. & Gurney, K. The short-latency dopamine signal: a role in discovering novel actions? Nat. Rev. Neurosci. 7, 967–975 (2006).

  46. Bromberg-Martin, E.S. & Hikosaka, O. Midbrain dopamine neurons signal preference for advance information about upcoming rewards. Neuron 63, 119–126 (2009).

  47. Rice, M.E. & Cragg, S.J. Nicotine amplifies reward-related dopamine signals in striatum. Nat. Neurosci. 7, 583–584 (2004).

  48. Addicott, M.A., Pearson, J.M., Wilson, J., Platt, M.L. & McClernon, F.J. Smoking and the bandit: a preliminary study of smoker and nonsmoker differences in exploratory behavior measured with a multiarmed bandit task. Exp. Clin. Psychopharmacol. 21, 66–73 (2013).

  49. Galván, A. et al. Greater risk sensitivity of dorsolateral prefrontal cortex in young smokers than in nonsmokers. Psychopharmacology (Berl.) 229, 345–355 (2013).

  50. Paxinos, G. & Franklin, K.B. The Mouse Brain in Stereotaxic Coordinates (Gulf Professional Publishing, 2004).

  51. Grace, A.A. & Bunney, B.S. Intracellular and extracellular electrophysiology of nigral dopaminergic neurons--1. Identification and characterization. Neuroscience 10, 301–315 (1983).

  52. Rokosik, S.L. & Napier, T.C. Intracranial self-stimulation as a positive reinforcer to study impulsivity in a probability discounting paradigm. J. Neurosci. Methods 198, 260–269 (2011).

  53. D'Acremont, M. & Bossaerts, P. Neurobiological studies of risk assessment: a comparison of expected utility and mean-variance approaches. Cogn. Affect. Behav. Neurosci. 8, 363–374 (2008).

  54. Behrens, T.E.J., Woolrich, M.W., Walton, M.E. & Rushworth, M.F.S. Learning the value of information in an uncertain world. Nat. Neurosci. 10, 1214–1221 (2007).

  55. Daw, N.D. Trial-by-trial data analysis using computational models. in Decision Making, Affect, and Learning: Attention and Performance XXIII (eds. Delgado, M.R., Phelps, E.A. & Robbins, T.W.) 3–38 (2011).

  56. McClure, S.M., Daw, N.D. & Montague, P.R. A computational substrate for incentive salience. Trends Neurosci. 26, 423–428 (2003).

Acknowledgements

We thank E. Guigon for discussions, C. Prévost-Solié for technical support, and J.-P. Changeux, E. Ey, G. Dugué and A. Boo for comments on the manuscript. This work was supported by the Centre National de la Recherche Scientifique CNRS UMR 8246, the University Pierre et Marie Curie (Programme Emergence 2012 for J.N. and P.F.), the Agence Nationale pour la Recherche (ANR Programme Blanc 2012 for P.F., ANR JCJC to A.M.), the Neuropole de Recherche Francilien (NeRF) of Ile de France, the Foundation for Medical Research (FRM, Equipe FRM DEQ2013326488 to P.F.), the Bettencourt Schueller Foundation (Coup d'Elan 2012 to P.F.), the Ecole des Neurosciences de Paris (ENP) to P.F., the Fondation pour la Recherche sur le Cerveau (FRC et les rotariens de France, “espoir en tête” 2012) to P.F. and the Brain & Behavior Research Foundation for a NARSAD Young Investigator Grant to A.M. The laboratories of P.F. and U.M. are part of the École des Neurosciences de Paris Ile-de-France RTRA network. P.F. and U.M. are members of the Laboratory of Excellence, LabEx Bio-Psy, and P.F. is a member of the DHU Pepsy.

Author information

Contributions

J.N. and P.F. designed the study. S.T. and J.N. performed the virus injections. M.D., N.T., G.R. and J.N. performed the behavioral experiments. S.V. and F.M. performed the electrophysiological recordings. S.P. and U.M. provided the genetic tools. J.N., F.M. and P.F. analyzed the data. J.N., A.M. and P.F. wrote the manuscript.

Corresponding author

Correspondence to Philippe Faure.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 Analysis of locomotion in the ICSS-based bandit task

(a) Dwell times (see Methods) were shorter (T(18)=3.67, p=0.002, paired t-test) in the certain setting (CS) than in the uncertain setting (US). In the US, the reward probability of the target had no effect on dwell times (F(2,18)=0.2, p=0.82, one-way ANOVA). (b) Variation of the instantaneous speed in the CS: the maximal speed of WT mice depended on the ICSS intensity (F(2,18)=13.2, p<0.001, one-way ANOVA), contrary to what was observed in the US with different probabilities of reward (see Fig. 1e).

Supplementary Figure 2 Comparison of models of decision-making and locomotion

(a) Bayesian Information Criterion (BIC) computed using three classical models of action selection that ignore uncertainty (matching law, epsilon-greedy, softmax; see Methods) and two alternative models that take uncertainty into account (softmax model with an uncertainty bonus, or with an uncertainty-modulated temperature parameter; see Methods). Smaller BIC values indicate a better fit, which here was provided by the uncertainty bonus. (b) BIC derived from multiple linear regression (see Methods) for exploratory locomotion models embedding an increasing number of explanatory variables. The red star and crosses indicate the winning model, which incorporates the reward history (R(t-1)) and the expected value (E(R)) and uncertainty (σ²(R)) of the chosen goal (indexed A), but not those of the alternative goal (indexed B).
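As an illustration of the kind of model named in this legend, the following Python sketch implements a softmax choice rule with an additive uncertainty bonus. The exact functional form is given in the Methods, which are not reproduced here, so several points are assumptions: rewards are treated as 0/1 Bernoulli outcomes (so expected value is p and variance is p(1 - p)), the bonus is weighted by a parameter called `phi`, and `beta` is used as an inverse temperature. The function names and numerical values are hypothetical.

```python
import math

def bernoulli_stats(p):
    """Expected value and variance of a 0/1 reward delivered with probability p."""
    return p, p * (1.0 - p)

def softmax_uncertainty_bonus(probs, phi, beta):
    """Choice probabilities over options with reward probabilities `probs`.

    Assumed form (not taken from the Methods): each option is scored as
    E(R) + phi * var(R), and the scores pass through a softmax with
    inverse temperature beta. phi = 0 gives a plain softmax on E(R).
    """
    scores = []
    for p in probs:
        expected, variance = bernoulli_stats(p)
        scores.append(beta * (expected + phi * variance))
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# One binary gamble between a 50% and a 100% location (illustrative values).
plain = softmax_uncertainty_bonus([0.5, 1.0], phi=0.0, beta=3.0)
bonus = softmax_uncertainty_bonus([0.5, 1.0], phi=1.5, beta=3.0)
```

With a positive `phi`, the 50% option (the only one with non-zero variance in this gamble) is chosen more often than under the plain softmax, which is the qualitative signature of uncertainty-seeking.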

Supplementary Figure 3 Robustness of the parameters derived from the decision-making model

(a) Comparison between transitions in the last two sessions (#9-10), displayed in the main part of the results, and transitions measured two sessions before (#7-8). Transitions from these two data sets were not significantly different for any gamble (G1: T(18)=0.44, p=0.67; G2: T(18)=-1.36, p=0.19; G3: T(18)=-1.64, p=0.12, paired t-tests), indicating that the results are stable across sessions and that the mice's decisions had reached steady state in this setting. (b, c, d) Proportions of exploitative choices (choice of the most valuable alternative) of the mice for the three gambles in different sets of reward probabilities: {25%, 50%, 75%} (b); {50%, 75%, 100%} (c); {25%, 75%, 100%} (d). (e) Parameters (ϕ and β) derived from the model-based analysis (uncertainty model) of the transition functions, for the probabilities used in the main text (black) and in the present panels (b, green; c, purple; d, light blue). In each case the uncertainty-seeking parameter was significantly positive, showing that the parameters derived from the model provide a robust characterization (across probability sets) of the influence of uncertainty on the decision-making process.

Supplementary Figure 4 β2*-nAChRs are not involved in motivation by certain rewards

(a) Learning of the task in the deterministic setting (DS), with performance increasing across learning sessions for both groups (session effect: F=12.16, p<0.001) and no difference between β2KO and WT mice (genotype effect: F=0.04, p=0.84; genotype × session interaction: F=0.99, p=0.45, two-way ANOVA). (b) In the DS, the rate of ICSS behavior (number of ICSS per minute) scaled with the intensity of current pulses up to a plateau for both groups (intensity effect: p<0.001, two-way ANOVA). When tested with different intensities of current pulses, β2KO mice performed the task at the same level as WT mice (genotype effect: F=1.22, p=0.27; genotype × intensity interaction: F=0.73, p=0.74, two-way ANOVA). (c) In the DS, when the ICSS intensity increases (I(ICSS) = {40, 80, 120} µA), the speed profile of β2KO mice is affected, with higher maximal speed (F(2,10)=36.35, p<0.001, one-way ANOVA) at higher intensities of ICSS. (d) Maximal speeds corresponding to the three ICSS intensities ({40, 80, 120} µA) for β2KO and WT mice did not differ significantly (genotype effect: F=0.86, p=0.36; genotype × intensity interaction: F=0.16, p=0.86). Note that the values given in (d) do not correspond to the peaks in the speed profiles because the maximum of the average speed profile does not necessarily correspond to the average of the maximal speeds.

Supplementary Figure 5 Analysis of locomotion in β2KO and β2VEC mice

(a) The speed profile of β2KO mice was not significantly modified by the reward probability of the target (F(2,10)=0.08, p=0.93). (b) β2KO mice travelled the same distance whatever the target probability (F(2,10)=0.14, p=0.87); hence the relation between the reward probability of the target place and the cumulative distance travelled was altered in β2KO mice. (c) The speed profiles of β2VEC mice were similar irrespective of the probability of the next reward (F(2,11)=0.21, p=0.81). (d) When going towards less likely ICSS, β2VEC mice tended to travel more (F(2,11)=6.2, p=0.005), showing that β2 nAChRs in the VTA are sufficient to restore the balance between exploiting the task and exploring the open field.

Supplementary Figure 6 Additional measures of restoration of functional β2*-nAChRs by the lentiviral injection

(a-d) Example of a recorded neuron: (a) neurobiotin, (b) eGFP and (c) tyrosine hydroxylase identify, respectively, DA cells (green), the neuron re-expressing the β2 subunit (red), and a recorded cell (blue). eGFP, enhanced green fluorescent protein. (e) Mean ± s.e.m. increase in DA cell firing frequency after injection of 30 µg/kg nicotine, in WT (n=46, gray), β2KO (n=20, red) and β2VEC (n=45, black) mice. (f) Same for the proportion of spikes within bursts (%SWB). Vertical dashed bar indicates nicotine injection.

Supplementary Figure 7 Model comparison and robustness in β2KO and β2VEC mice

(a,b) Bayesian Information Criterion (BIC) computed using the four models of action selection (matching law, epsilon-greedy, softmax, softmax with an uncertainty bonus; see Methods) for (a) β2KO mice and (b) β2VEC mice. In each case, the uncertainty model provided a smaller BIC, which indicates a better fit. (c, d, e) Proportions of exploitative choices (choice of the most valuable alternative) of β2KO mice for the three gambles in different sets of reward probabilities: {25%, 50%, 75%} (c); {50%, 75%, 100%} (d); {25%, 75%, 100%} (e). (f) Parameters derived from the model-based analysis (uncertainty model) of the transition functions of β2KO mice, for the probabilities used in the main text (black) and in the present panels (c, green; d, purple; e, light blue). The model parameters did not significantly differ between probability sets (for ϕ, F(3,37)=0.32, p=0.81; for β, F(3,37)=0.26, p=0.85).

Supplementary Figure 8 Learning phase in the probabilistic task: experimental data and model comparison

(a,b) Evolution of the proportion of choices of the three rewarded locations in the uncertain setting, across the learning sessions, for WT (a) and β2KO (b) mice. (c,d) Difference in Bayesian information criterion (compared to the standard RL model) of models including an expected uncertainty bonus (“uncertainty”), an adaptive learning rate (“adaptive LR”) and an unexpected uncertainty bonus, for WT (c) and β2KO (d) mice. (e,f) Model fits of the experimental data shown in (a,b) for the winning models, i.e. expected uncertainty for WT mice, and standard model for β2KO mice.
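The BIC differences described in panels (c,d) can be illustrated with the standard definition BIC = k ln(n) - 2 ln(L), where k is the number of free parameters, n the number of observations and L the maximized likelihood. The Python sketch below computes each model's BIC relative to a reference model. The model names, log-likelihoods, parameter counts and number of observations are hypothetical placeholders, not values from the study.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

def delta_bic(models, reference, n_obs):
    """BIC of each model minus the BIC of the reference model.

    `models` maps a name to (maximized log-likelihood, number of parameters).
    A negative difference means a better fit than the reference.
    """
    ref_ll, ref_k = models[reference]
    ref_bic = bic(ref_ll, ref_k, n_obs)
    return {name: bic(ll, k, n_obs) - ref_bic for name, (ll, k) in models.items()}

# Hypothetical log-likelihoods and parameter counts, for illustration only.
models = {
    "standard": (-520.0, 2),
    "uncertainty": (-500.0, 3),
}
diffs = delta_bic(models, "standard", n_obs=800)
```

In this made-up example the extra parameter of the "uncertainty" model costs ln(800) in penalty, which is outweighed by its gain of 20 log-likelihood units, so its BIC difference is negative.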

Supplementary Figure 9 Model comparison in the dynamic foraging task

(a) Computational models of reinforcement-learning and decision-making used to analyze the behavioral data, summarizing whether sensitivity to uncertain outcomes arises from learning, decision, or both processes. (b,c) Bayesian Information Criterion (BIC) for the standard reinforcement learning model and alternative models: standard model with asymmetric learning (L) rates for positive and negative outcomes, uncertainty model with a single learning rate for value and uncertainty (bonus), uncertainty model with separate learning rates for value and uncertainty, uncertainty model with three learning rates (for positive and negative outcomes, and for uncertainty). A smaller BIC value indicates a better fit; the best-fitting model was the uncertainty model with separate learning rates for value and uncertainty for WT mice (b) and the standard reinforcement learning model for β2KO mice (c).
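To make the idea of "separate learning rates for value and uncertainty" concrete, here is a minimal Python sketch of a delta-rule learner. It is not the authors' model: it assumes that uncertainty is tracked as a running average of the squared prediction error, and the names `alpha_value` and `alpha_uncertainty`, the initial values and the reward sequences are all hypothetical.

```python
def update(value, uncertainty, reward, alpha_value, alpha_uncertainty):
    """One delta-rule step for an option's value and its expected uncertainty.

    Assumption: uncertainty is a running average of the squared prediction
    error, updated with its own learning rate.
    """
    error = reward - value
    new_value = value + alpha_value * error
    new_uncertainty = uncertainty + alpha_uncertainty * (error ** 2 - uncertainty)
    return new_value, new_uncertainty

def learn(rewards, alpha_value=0.1, alpha_uncertainty=0.1):
    """Run the update over a reward sequence, starting from zero estimates."""
    value, uncertainty = 0.0, 0.0
    for reward in rewards:
        value, uncertainty = update(
            value, uncertainty, reward, alpha_value, alpha_uncertainty
        )
    return value, uncertainty

# A location rewarded every time versus one rewarded on alternate visits.
certain = learn([1.0] * 200)
alternating = learn([1.0, 0.0] * 100)
```

Under these assumptions the always-rewarded location converges to a value near 1 with an uncertainty estimate near 0, while the alternately rewarded location settles near a value of 0.5 with a clearly positive uncertainty estimate. Setting the two learning rates equal corresponds to the single-learning-rate variant listed in the legend.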

Supplementary Figure 10 Alternative models for the spatial learning and passive avoidance tasks

(a) Variations of the temperature parameter (β) in the simulation of the spatial learning task using the standard reinforcement-learning model. Original experimental data are represented (mean ± s.e.m.) by dots (black for WT, red for β2KO). The curves represent the modeling of the data with an increased value of β (from top to bottom, black to dark blue). (b) Variations of the initial value (V0) of the rewarding arm in the simulation of the spatial learning task using the standard reinforcement-learning model. Same presentation as (a). (c) Variations of the learning rate (α) in the simulation of the spatial learning task using the standard reinforcement-learning model. Same presentation as (a). (d) In the simulation of the spatial learning task using the standard reinforcement-learning model, combined modifications of initial value and learning rate hardly explain the WT data. Data are shown as dots with error bars (mean + s.e.m.), simulation as stripes. (e) Variations of the temperature parameter (β) in the simulation of the passive avoidance task using a sequential reinforcement-learning model. Same presentation as (a). (f) Variations of the baseline activity (θ) in the simulation of the passive avoidance task using a sequential reinforcement-learning model. Same presentation as (a). Data in (a-d) adapted with permission from Ref 42. Data in (e,f) adapted with permission from Ref 43.

Supplementary Figure 11 Model simulation: open-fields without rewards and object recognition

(a) Decomposition of behavior in an open-field. Locomotion in the open field is transformed into four states, resulting from the differentiation between active (A) or inactive (I) states (depending on the velocity) and periphery (P) or center (C) zones. (b) Discretized representation of the behavior based on the four-state decomposition, used for model simulation. Possible transitions are represented by plain arrows and forbidden transitions by dashed arrows. (c,d) Simulation of transition probabilities between “center-active” (CA) and “center-inactive” (CI) states (c), and between “periphery-active” (PA) and “center-active” (CA) states (d), for WT (black, model with uncertainty bonus) and β2KO (red, model without uncertainty bonus) mice. (e) Simulation of total time spent in inactive states (PI and CI) for WT (black) and β2KO (red) mice. (f) Object recognition in an open-field. Two states represent the object areas; the rest of the open-field is modeled as 25 discrete states. (g) Total time spent in the “object areas” states for WT (black, model with uncertainty bonus) and β2KO (red, model without uncertainty bonus) mice. Data in (c-e) adapted with permission from Ref 13. Data in (g) adapted with permission from Ref 42.
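The four-state decomposition in panels (a,b) can be sketched in Python as follows. This is only an illustration under stated assumptions: the velocity threshold, its units, the sample data and the function names are hypothetical, and the forbidden transitions of panel (b) are not encoded because the legend does not list them.

```python
from collections import Counter

def classify(speed, in_center, speed_threshold):
    """Map one tracking sample to PA, PI, CA or CI.

    Zone: periphery (P) or center (C). Activity: active (A) if the speed
    reaches the threshold, inactive (I) otherwise.
    """
    zone = "C" if in_center else "P"
    activity = "A" if speed >= speed_threshold else "I"
    return zone + activity

def transition_probabilities(states):
    """Estimate P(next state | current state) from changes of state."""
    counts = Counter()
    for current, nxt in zip(states, states[1:]):
        if current != nxt:  # only count actual state changes
            counts[(current, nxt)] += 1
    totals = Counter()
    for (current, _), count in counts.items():
        totals[current] += count
    return {pair: count / totals[pair[0]] for pair, count in counts.items()}

# Hypothetical samples: (speed, inside the center zone?)
samples = [
    (8.0, False), (1.0, False), (9.0, False), (7.0, True),
    (0.5, True), (6.0, True), (10.0, False),
]
states = [classify(s, c, speed_threshold=5.0) for s, c in samples]
probs = transition_probabilities(states)
```

For these samples the state sequence is PA, PI, PA, CA, CI, CA, PA, so the estimated probability of moving from CA to CI is 0.5 and from CI to CA is 1.0. The same counting could be applied to real tracking data to obtain the transition probabilities compared across genotypes in panels (c,d).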

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–11 (PDF 2272 kb)

Supplementary Methods Checklist

(PDF 494 kb)

About this article

Cite this article

Naudé, J., Tolu, S., Dongelmans, M. et al. Nicotinic receptors in the ventral tegmental area promote uncertainty-seeking. Nat Neurosci 19, 471–478 (2016). https://doi.org/10.1038/nn.4223

  • DOI: https://doi.org/10.1038/nn.4223
