Predict future decisions with brain signals

The human brain plans and controls decision-making behavior; therefore, brain activity contains information that could predict future decisions. Linking brain activation patterns with behavioral choices provides invaluable knowledge about the nature of human decision-making (e.g., Cohen & Ranganath, 2007; Knutson, Rick, Wimmer, Prelec, & Loewenstein, 2007). In the context of risk decision-making (i.e., different options vary in their risk levels), the brain activity that predicts upcoming risk decisions may help unravel the psychological mechanisms of risk taking. For instance, the predictive power of reward-related brain regions (e.g., orbitofrontal cortex [OFC] and basal ganglia) indicates that risk decision-making is strongly reward driven (Bechara, Damasio, & Damasio, 2000; Christopoulos, Tobler, Bossaerts, Dolan, & Schultz, 2009; Smith et al., 2009). Taking a step further, researchers have found that abnormalities in these reward-related regions lead to maladaptive risk taking in clinical samples (e.g., pathological gambling, drug abuse, unprotected sex; Bolla et al., 2003; Robbins, Gillan, Smith, de Wit, & Ersche, 2012). Thus, neuroscience research on predicting risky choices is not only theoretically interesting but also practically important.

Various neuroscience techniques have been used to predict upcoming risk decision-making in humans, especially brain-imaging and event-related-potential (ERP) methods. Brain-imaging research has identified a wide range of brain regions that are strongly involved in risk decision-making, including the midbrain dopaminergic system, such as the ventral striatum, and frontal areas, such as the medial prefrontal cortex (MPFC) (Hsu, Bhatt, Adolphs, Tranel, & Camerer, 2005; Krain, Wilson, Arbuckle, Castellanos, & Milham, 2006; Levy, Snell, Nelson, Rustichini, & Glimcher, 2010). In general, the dopaminergic system focuses on reward processing, while the frontal areas estimate potential risk and/or cost of action. Human decision-making may depend on the balance between these two systems (Rushworth & Behrens, 2008). Indeed, the activities of both systems show reliable predictive power in risk decision-making tasks. Signal changes in the ventral striatum significantly predict the willingness to take economic risk (e.g., gamble; Buchel, Brassen, Yacubian, Kalisch, & Sommer, 2011; Huang, Soon, Mullette-Gillman, & Hsieh, 2014; Niv, Edlund, Dayan, & O’Doherty, 2012). Regarding frontal areas, the MPFC not only encodes the difference in risk level between options but also shows increased activity for risky compared with safe decisions (Wright, Symmonds, & Dolan, 2013). Meanwhile, the importance of the insula, anterior cingulate cortex (ACC), and parietal cortex should not be underestimated (Kolling, Wittmann, & Rushworth, 2014; Mohr, Biele, & Heekeren, 2010; Platt & Huettel, 2008; Smith et al., 2009). However, hemodynamic-based brain-imaging techniques suffer from limited temporal resolution (Deshpande, LaConte, Peltier, & Hu, 2009). Therefore, it is difficult to judge whether the brain activations that discriminate between risky and safe trials reflect the neural activities that precede and determine the decision, or instead reflect the consequence of that decision (Helfinstein et al., 2014). For the same reason, brain-imaging techniques have difficulty dissociating psychological functions that overlap heavily in the time domain (Amodio, Bartholow, & Ito, 2014). Considering that temporally adjacent cognitive functions (e.g., attention, mental representation, and information integration) might be involved in risk decision-making (Brand, Labudda, & Markowitsch, 2006; Weber & Johnson, 2009), this study employs the ERP as the brain recording technique because of its exquisite temporal resolution (Liotti, Woldorff, Perez, & Mayberg, 2000).

ERP approach for prediction

Decision-making can be divided into multiple stages; generally, people evaluate available options (option assessment), make choices (behavioral output), and then receive outcome feedback (feedback processing) (Paulus, 2005; Platt, 2002). To our knowledge, most previous ERP studies make behavioral predictions by investigating the feedback processing stage, that is, using feedback-evoked signals in one trial to predict behavioral decisions in the next trial (e.g., Cohen & Ranganath, 2007; Zhang et al., 2013). This between-trial analysis reflects feedback learning, which is based on the rationale that participants would learn from feedback and adjust behavioral strategies accordingly, even though adjacent trials are independent from each other (Cohen & Ranganath, 2005; Peterson, Lotz, Halgren, Sejnowski, & Poizner, 2011; Starcke & Brand, 2016).

In light of this idea, previous studies have found that the amplitudes of the ERP components elicited by current feedback, particularly the feedback-related negativity (FRN) and P3, are associated with risk decision in the subsequent trial (Cohen & Ranganath, 2007; San Martín, Appelbaum, Pearson, Huettel, & Woldorff, 2013; Zhang et al., 2014; Zhang et al., 2013). In a strategic gambling game, Cohen and Ranganath (2007) discovered that a heightened FRN response elicited by monetary loss indicates a stronger tendency to select a different option on the next trial, which is interpreted from the perspective of trial-by-trial reward learning (see also Cavanagh, Frank, Klein, & Allen, 2010; Cohen, Wilmes, & van de Vijver, 2011; van der Helden, Boksem, & Blom, 2010). Also, our previous studies have reported that when the P3 amplitude elicited by feedback became larger, participants would tend to switch between risk seeking and risk avoidance in adjacent trials (Zhang et al., 2014; Zhang et al., 2013). These findings, based on between-trial ERP analyses, reveal that feedback information contributes to future risk preference. The electroencephalogram (EEG) approach of between-trial research also supports this viewpoint, generally. For instance, Pedroni et al. (2017) examined the spontaneous EEG during intertrial intervals in a sequential risk-taking task and found an EEG measure (i.e., the percentage of time covered by EEG microstates) that could reflect the influence of current feedback on subsequent risk preference.

Between-trial analysis focuses on how an outcome of one decision influences another decision in the next trial, which mainly uncovers the mechanism of feedback learning (Cohen et al., 2011; Zhang et al., 2013). To reach a more comprehensive understanding of risk decision-making, researchers should also directly establish a predictive relationship between option assessment and behavioral output within the same trial (i.e., within-trial analysis). During the stage of option assessment, people evaluate the potential reward and risk associated with each option and make a trade-off between these two aspects (Weber & Johnson, 2008). These cognitive functions are beyond the framework of feedback learning. Within-trial ERP analysis could help us to observe how these functions shape risk preference and how they turn into actions. To our knowledge, limited progress has been made in this direction. One exception is a study conducted by Cohen, Elger, and Fell (2009), which focused on EEG data and found that EEG power increased within the delta (1–4 Hz) and theta (4–8 Hz) frequency bands during option assessment. Nevertheless, Cohen et al. (2009) did not detect any association between the EEG and behavioral choices. In our opinion, a difficulty for this line of research is ensuring that the electrophysiological signals of interest are time locked to specific psychological functions, which is a basic requirement for both ERP and event-related EEG research (Luck, 2005). However, in most decision-making paradigms used in ERP/EEG studies, the available options, as well as the given circumstances, remain unchanged throughout the task (e.g., Gehring & Willoughby, 2002). As a result, participants may assess options and make up their minds before the time point of option presentation (Cohen et al., 2011).

Balloon Analogue Risk Task (BART)

We suggest that the Balloon Analogue Risk Task (BART), a classic paradigm of sequential decision-making (Lejuez et al., 2002), would be suitable for ERP-based within-trial prediction. Sequential decision-making means that people must make multiple consecutive decisions within the same trial (Pleskac, Wallsten, Wang, & Lejuez, 2008). In each trial of the BART, participants can pump up an initially empty balloon over many rounds; with each pump, the balloon either grows incrementally larger or explodes. After a successful pump, the balloon is inflated to a greater volume (we call this the periodical outcome), indicating that the accumulated reward has increased. Participants then choose between cashing out that reward (a safe option) and continuing to bet for more reward (a risky option). Thus, a positive outcome in the current round also represents the value of the safe option in the next round, that is, the reward that participants would receive if they discontinue. We are interested in using the ERPs elicited by the periodical outcome (also the safe option of the next round) to predict the upcoming decision within the same trial. For two reasons, we believe that it would be promising to build a time-locked relationship between the ERPs evoked by the periodical outcome and the stage of option assessment. First, participants do not know whether they will get another opportunity to make a decision until they receive the periodical outcome. Second, both the amount of accumulated reward and the level of economic risk vary across decision rounds; thus, the information about reward and risk is dynamic and needs real-time evaluation. In short, the BART provides a task structure that is well suited to within-trial behavior prediction using ERPs time locked to option assessment.

An abundance of research has been devoted to exploring the neural activations associated with the BART, which informs the present study. First, compared with a passive no-choice condition in which participants continually inflate a balloon until it explodes, active engagement in the BART results in greater activations in the dorsolateral prefrontal cortex (DLPFC), possibly reflecting the cognitive control mechanism (Cazzell, Li, Lin, Patel, & Liu, 2012; Lin, Li, Cazzell, & Liu, 2014; Rao, Korczykowski, Pluta, Hoang, & Detre, 2008; L. L. Rao et al., 2014). Second, many brain regions respond to the risk level associated with each pump, including the ACC, MPFC, OFC, insula, and the midbrain reward processing system (Kohno, Morales, Guttman, & London, 2017; Mata, Hau, Papassotiropoulos, & Hertwig, 2012; Rao et al., 2008; Schonberg et al., 2012; Telzer, Fuligni, Lieberman, & Galvan, 2013a, 2013b; Telzer, Fuligni, Lieberman, Miernicki, & Galvan, 2015). Some of these regions help to explain individual differences in risk preference during the BART. For example, participants who showed stronger activation in the ventral striatum were less likely to cash out (Telzer et al., 2015). Poor sleep quality in adolescents leads to more risky decisions and reduced functional coupling between the DLPFC and ventral striatum (Telzer et al., 2013a). These findings suggest the importance of individual striatal dopamine variability in the BART, which is also supported by studies on genetic differences in dopaminergic systems (Mata et al., 2012). In addition, Jentsch, Woods, Groman, and Seu (2010) found that temporary inactivation of the OFC significantly reduced the tendency toward risk taking; to explain this behavioral change, Jentsch et al. (2010) proposed that the OFC is crucial for the incentive motivation to obtain reward.

Previous ERP studies have also revealed that the amplitudes of the FRN and P3 are sensitive to the factors of reward and risk in the BART. Regarding the reward factor, both the FRN and P3 are larger following negative (i.e., balloon explosion) compared with positive feedback (Crowley et al., 2009; Euser, van Meel, Snelleman, & Franken, 2011; Yau, Potenza, Mayes, & Crowley, 2015). Regarding the risk factor, a series of studies conducted by Euser and her colleagues show that the P3 amplitude elicited by positive feedback is correlated with the total number of risky choices (Euser, Evans, Greaves-Lord, Huizink, & Franken, 2013a; Euser, Greaves-Lord, et al., 2013b; Euser et al., 2011). Furthermore, Kardos et al. (2016) found that the reward positivity (which is argued to be analogous to the FRN; see Proudfit, 2015, for a review) elicited by positive feedback was larger preceding a safe decision compared with a risky but failed (i.e., balloon explosion) decision. These findings support the roles of the FRN and P3 in feedback processing on the BART when outcome processing responses are averaged across trials. Furthermore, an investigation of within-trial outcome processing on the BART by Kiat, Straley, and Cheadle (2016) has also shown that the amplitude of the P2 and FRN elicited by outcomes increases as a function of risk level as participants progress through each BART trial.

Unfortunately, most previous BART studies did not examine the relationship between brain signals and behavioral decisions on a within-trial basis, that is, whether brain activations elicited by the periodical outcome could predict the decision in the following round. A notable exception is the study by Helfinstein et al. (2014), who used functional magnetic resonance imaging (fMRI) and discovered that the ACC, insula, and OFC are predictive of behavioral choices (pump/cash out) in the upcoming round. However, as we mentioned earlier, brain-imaging techniques generally have limited temporal resolution. Therefore, it is difficult to distinguish the roles of these regions in risk decision-making, that is, to isolate their activations in the temporal dimension and associate them with different cognitive functions.

Experimental hypotheses

The current study investigated the predictive validity of the ERPs elicited by periodical outcomes immediately preceding either risky or safe choices. A modified version of BART was used to fit our research interest (see the Method section for details). On the behavioral level, we expected that as decision rounds accumulate, the tendency to choose the risky option would significantly decline because of the increasing risk level (Bishara et al., 2009). On the electrophysiological level, we expected that ERPs that are time locked to the periodical outcome could predict subsequent decision-making (safe vs. risky) on the single-trial level. The most likely candidates are the FRN and P3. However, since it is unknown whether the ERPs following periodical outcome could predict future decisions in a single-trial setup, no specific hypothesis was made regarding the ERP components.

Method

Participants

Forty-four healthy participants (24 females; age range: 18–24 years, mean = 20.2 ± 0.3 years) were recruited from Beijing Normal University as paid participants. All participants were right-handed and had normal or corrected-to-normal vision. Written informed consent was obtained prior to the experiment. The experimental protocol was approved by the local ethics committee (Beijing Normal University), and this study was performed strictly in accordance with the approved guidelines.

Procedure

Before the experiment, participants were instructed about the rules and the meaning of the symbols in the task (see Fig. 1 for the experimental instruction). In brief, there were at most three rounds in each trial, and participants were asked to choose between “pump” and “cash out” in each round. Participants were told that the probability of balloon inflation versus explosion was 50% versus 50% in every round. They were also encouraged to respond in any way that would maximize their total score. The higher the score they earned, the more bonus money they would receive at the end of the experiment.

Fig. 1

Graphical instruction for participants prior to the experiment. Participants were told: “In this task, you can pump up each balloon up to three times. You will make 5, 10, or 20 points being added to the total scores if the first, the second, or the third pump is successful. The probability of successful and unsuccessful pump is 50% vs. 50%. If the balloon explodes, you will get zero points for this balloon. If you stop pump and cash out the current points, the accumulated points will be added to the total scores immediately.” (Color figure online)

During the formal task, participants sat comfortably in an electrically shielded room, approximately 100 cm from a computer screen. The sequence of stimulus presentation is shown in Fig. 2. In each trial, a simulated red balloon (3.0° × 3.5° of visual angle at the start of the trial) was presented at the center of the screen. In each round, a black question mark appeared inside the balloon, indicating that participants could either inflate the balloon further or stop the current trial and cash out the points by pressing the “F” or “J” key on the keyboard with their left or right index finger, respectively. The mapping between the two options and the button positions was counterbalanced across participants. Choosing the “pump” option made the question mark disappear for a random interval of 800 to 1,200 ms. Thereafter, if the pump succeeded, the balloon became larger (its on-screen size increased by 5%) and the current accumulated reward appeared inside the balloon with a plus sign. Participants then waited for 1,500 ms until the question mark was presented again, indicating the beginning of the next round. Otherwise, an explosion symbol replaced the balloon for 1,500 ms, indicating that the accumulated reward was lost. Participants then saw the capitalized word “NEXT” for 1,500 ms, indicating that a new trial was about to begin. Choosing the “cash out” option resulted in the immediate presentation of “NEXT,” with the accrued points being added to the total score. All numbers and characters presented in the task were written in boldface Arial font. The experiment contained five identical blocks of 90 trials each (i.e., 90 separate balloons), separated by short breaks. At the end of the task, the total earned points were displayed to participants. Participants were then debriefed and were paid 60–100 Chinese yuan according to their final scores. Stimulus display and behavioral data acquisition were conducted using E-Prime 2.0 (Psychology Software Tools, Inc., Pittsburgh, PA, USA).

Fig. 2

Illustration of experimental trials, which describes four possible consequences (corresponding to the four rows) if participants kept pumping up the balloon, that is, the balloon might end up exploding in Rounds 1, 2, or 3, or might finish as a 20-point reward. The ERP data in blue boxes were used to predict the decision in the upcoming round. All the four kinds of trials ended up with a word “NEXT,” which indicated a new trial was coming. (Color figure online)

Overall, the structure of formal task was consistent with the classic BART. While retaining the key features of sequential risk taking, we also made modifications in three aspects to fit the purposes of this study.

First, the classic BART is constructed to allow a total of 128 possible pumps per trial (Pleskac, 2008). In order to analyze neural responses in each round separately, the maximum round number per trial was shortened to three pumps to ensure that each round had sufficient epochs available for ERP averaging while preventing the task from being too lengthy. Similar changes have been widely applied in other neuroscience studies using the BART (Fein & Chang, 2008; Helfinstein et al., 2014).

Second, the BART is an ill-defined problem because participants are unaware of the chances of explosion (e.g., Pleskac, 2008). Participants need to explore the concealed probabilistic structure of the paradigm through trial-and-error learning. However, this ill-defined characteristic may reduce the BART’s ability to identify real-world risk takers (Pleskac, 2008). In order to focus on subjective risk preference, the modified task was designed as a well-defined problem to control the potential influence of feedback learning, such that both the expected payoff of each pump and the probability of balloon explosion were disclosed to participants prior to the task. Similar revisions have been applied by Pleskac (2008), who compared two conditions such that probability information was unknown in one condition but was explicit in the other. According to Pleskac (2008), learning demands were minimized in the explicit condition. Indeed, his results revealed that task performance became more sensitive to real-world risk taking (e.g., drug abuse) in the explicit condition (see also Wallsten, Pleskac, & Lejuez, 2005).

Third, the probability of explosion in the classic BART increases with each successful pump (Pleskac, 2008). As a result, the classic BART has an expected value that first increases to a peak and then drops back down with continued pumping (Leland & Paulus, 2005). Thus, the pump option is the optimal choice at the early stage of every trial but gradually becomes suboptimal as the number of successful inflations accumulates. This task design confounds increased risk with varying expected value (Schonberg, Fox, & Poldrack, 2011). We modified the task such that the probabilities of win and loss were equal throughout the task (see also Yau et al., 2015, for similar modifications). The reward in the balloon was zero at the beginning of each trial (consistent with the classic BART), and increased from zero to 5 points after the first, from 5 to 10 points (i.e., increased by 5 points) after the second, and from 10 to 20 points (i.e., increased by 10 points) after the third successful pump. Accordingly, participants in the second pump faced a trade-off between earning 5 more points and losing the 5 points at hand; likewise, in the third pump they faced a trade-off between earning 10 more points and losing the 10 points at hand. Thus, the expected values of both the second pump (5 × 1/2 + (−5) × 1/2) and the third pump (10 × 1/2 + (−10) × 1/2) were zero (i.e., there was no optimal choice in these scenarios). In other words, the major difference between these two rounds was risk level, but not expected value.
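
As a quick check on this arithmetic, the following Python sketch recomputes the expected value of each pump from the payoff schedule described above (0, 5, 10, and 20 points) and the 50% explosion probability. The function and constant names are ours and purely illustrative. Under these payoffs, the first pump has a positive expected value (nothing is at stake yet), whereas the second and third pumps both have an expected value of zero and differ only in the amount at stake.

```python
from fractions import Fraction

# Accumulated reward after 0, 1, 2, and 3 successful pumps (task description).
REWARDS = [0, 5, 10, 20]
P_SUCCESS = Fraction(1, 2)  # inflation vs. explosion is 50% vs. 50% in every round


def pump_expected_value(round_number):
    """Expected change in points from pumping in a given round (1-3),
    relative to cashing out the reward already at hand."""
    at_hand = REWARDS[round_number - 1]
    gain = REWARDS[round_number] - at_hand  # points added if the pump succeeds
    loss = at_hand                          # points forfeited if the balloon explodes
    return P_SUCCESS * gain - (1 - P_SUCCESS) * loss


def points_at_stake(round_number):
    """Gap between the win and loss outcomes of a pump (gain + loss).
    An illustrative index of risk, not a measure used in the study."""
    return REWARDS[round_number]


for r in (1, 2, 3):
    print(r, pump_expected_value(r), points_at_stake(r))
```

The gap between the two possible outcomes doubles from Round 2 (10 points) to Round 3 (20 points) while the expected value stays at zero, which is the dissociation the modified design aims for.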

Behavioral measure

This study introduced a dependent variable of “risky ratio” to measure the preference for risky or safe (risk-avoidant) options (Zhang et al., 2014; Zhang et al., 2013). Risky ratio was calculated by dividing the number of risky decisions (i.e., pump up the balloon) by the total number of decisions (pump + cash out). Response time (RT) for each decision was also analyzed.
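
The risky ratio can be written as a one-line function. The sketch below is illustrative only: the function name and the example counts are hypothetical and are not taken from the data set.

```python
def risky_ratio(n_pump, n_cash_out):
    """Proportion of risky (pump) decisions among all decisions in a round."""
    total = n_pump + n_cash_out
    if total == 0:
        raise ValueError("no decisions recorded for this round")
    return n_pump / total


# Hypothetical participant: 30 pumps and 10 cash-outs in a given round.
print(risky_ratio(30, 10))  # 0.75
```

A ratio above 0.5 indicates a preference for the risky option in that round; the ratio is computed per participant and per round before averaging.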

EEG recording and data preprocessing

EEG activity was recorded with a 64-channel amplifier (NeuroScan Inc., Herndon, VA, USA) at a sampling frequency of 250 Hz, referenced online to a frontal electrode site and rereferenced off-line to the global average. In addition to two electrooculogram channels, 62 EEG channels were recorded, with electrode impedances kept below 5 kΩ.

Ocular artifacts were removed from the EEG using a regression procedure implemented in NeuroScan software (Scan 4.3). The recorded EEG data were then filtered (0.01–30 Hz) and segmented beginning 200 ms prior to the onset of stimulus and lasting for 1,600 ms. All epochs were baseline-corrected with respect to the mean voltage over the 200 ms preceding the onset of stimulus, followed by averaging in association with experimental conditions.
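
To make the segmentation and baseline-correction steps concrete, the following NumPy sketch cuts 1,600-ms epochs that start 200 ms before onset and subtracts the mean pre-stimulus voltage from each channel. It is a minimal illustration on synthetic data, not the NeuroScan pipeline used here: ocular correction and filtering are not reproduced, and all function names and onset values are hypothetical.

```python
import numpy as np

FS = 250          # sampling rate in Hz, as reported for the recording
PRE_MS = 200      # epoch starts 200 ms before outcome onset
EPOCH_MS = 1600   # total epoch length in ms


def ms_to_samples(ms, fs=FS):
    """Convert a duration in milliseconds to a whole number of samples."""
    return int(round(ms * fs / 1000))


def epoch_and_baseline(eeg, onsets):
    """Cut epochs around each onset and subtract the pre-stimulus mean.

    eeg    : array (n_channels, n_samples), continuous data
    onsets : sample indices of outcome onset (each must leave room for a full epoch)
    returns: array (n_epochs, n_channels, n_times)
    """
    n_pre, n_len = ms_to_samples(PRE_MS), ms_to_samples(EPOCH_MS)
    epochs = np.stack([eeg[:, o - n_pre:o - n_pre + n_len] for o in onsets])
    baseline = epochs[:, :, :n_pre].mean(axis=2, keepdims=True)
    return epochs - baseline


# Synthetic demo: 62 channels, 20 s of noise, three hypothetical onsets.
rng = np.random.default_rng(0)
eeg = rng.normal(size=(62, 20 * FS))
epochs = epoch_and_baseline(eeg, onsets=[500, 1500, 3000])
print(epochs.shape)  # (3, 62, 400)
```

At 250 Hz the 200-ms baseline spans 50 samples and each epoch spans 400 samples; condition averages are then simple means over the first axis.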

ERP analysis and single-trial prediction

This study used the ERPs elicited by the periodical outcome in Round 1 and Round 2 to predict behavioral choices in Round 2 and Round 3, respectively (see the two blue rectangles in Fig. 2). Only the ERPs following a positive periodical outcome (i.e., a successful pump) were considered in the analysis, because after a negative outcome (i.e., explosion) no further action could be taken due to the task structure (see also Helfinstein et al., 2014).

Through visual inspection of the averaged ERP waveforms, we found that the P1 and P3 components were sensitive to the difference between safe and risky trials. In contrast, no clear FRN was observed in the frontal area. Accordingly, the P1 and P3 components were chosen for further analyses. The selection of these components was data-driven.

First, we analyzed the mean amplitudes of the lateral occipital P1 and the parietal P3 across different sets of electrodes. Time windows for mean amplitude calculation were centered at the peak latencies of the ERP components in the grand-mean waveforms, with a shorter window length for early components and a longer length for late components. The peak latency of the P1 was also analyzed, because visual inspection of the grand-mean ERP waveforms suggested that it differed between safe and risky trials. The mean amplitude and peak latency of the P1 were measured using the average data at the electrode sites of O1, O2, PO3, PO4, PO5, PO6, PO7, and PO8 (analysis window after the onset of periodical outcomes: 80–130 ms when participants pumped in the next round, 120–170 ms when they cashed out in the next round). Since the P1 component has scarcely been reported in the decision-making literature, this study did not make any hypothesis concerning the hemispheric distribution of the P1. Accordingly, the P1 measures in the left and right hemispheres were collapsed to obtain a high signal-to-noise ratio (see also Eldar et al., 2012; Hammerschmidt, Sennhenn-Reulen, & Schacht, 2017; Liu, Zhang, & Luo, 2014; Raz, Dan, & Zysberg, 2014; Zhang, He, Chen, & Wei, 2016). Meanwhile, the mean amplitude of the P3 was measured using the average data at the electrode sites of Cz, CPz, Pz, CP1, and CP2 between 300 and 450 ms after outcome onset.

Second, we used single-trial ERPs elicited by the periodical outcome in the current round to predict whether participants would choose risky (pump) or safe (cash out) options in the next round. To obtain a robust measurement of the spatial-temporal information buried in single-trial ERPs, we used the average activity across short intervals (60 ms in length with a sliding step size of 4 ms; i.e., neighboring 60-ms intervals had an overlap of 56 ms) as classification features (Blankertz, Lemm, Treder, Haufe, & Muller, 2011; Mars et al., 2008; Philiastides, Ratcliff, & Sajda, 2006; Steinhauser & Yeung, 2010). Please note that a relatively short interval (60 ms) was used in the single-trial analysis because a narrow time window could reflect temporal changes of ERP activity more accurately. Considering that the feedback-evoked ERP epochs had a length of 1,600 ms in this study, we finally had 386 short intervals as prediction data ((1600 − 60)/4 + 1 = 386). Each interval had 62 mean amplitudes from the 62 recording electrodes. Then, the 62-dimensional feature vector was put into logistic regression (LR) classifiers to label the behavioral decision of the subsequent choice as “pump” or “cash out.” The LR classifier has been widely applied in the field of cognitive neuroscience (e.g., Bhattacharyya, Konar, Tibarewala, & Hayashibe, 2017; De Lucia & Tzovara, 2015; Fischer & Ullsperger, 2013; Munneke, Nap, Schippers, & Cohen, 2015; Tzovara, Chavarriaga, & De Lucia, 2015; van Gerven, Hesse, Jensen, & Heskes, 2009).
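
The short-interval feature extraction can be sketched as follows. The example is a hedged illustration on synthetic epochs, assuming 400 samples per epoch (1,600 ms at 250 Hz); the function name is hypothetical, and `numpy.lib.stride_tricks.sliding_window_view` requires NumPy 1.20 or later. Each 60-ms window (15 samples) is averaged, and windows are advanced in 4-ms steps (1 sample), which reproduces the 386 intervals per epoch.

```python
import numpy as np


def sliding_window_means(epochs, fs=250, win_ms=60, step_ms=4):
    """Average each channel over overlapping short intervals.

    epochs : array (n_epochs, n_channels, n_times)
    returns: array (n_epochs, n_channels, n_windows)
    """
    win = int(round(win_ms * fs / 1000))    # 15 samples at 250 Hz
    step = int(round(step_ms * fs / 1000))  # 1 sample at 250 Hz
    # The window axis is appended last: (n_epochs, n_channels, n_times - win + 1, win)
    windows = np.lib.stride_tricks.sliding_window_view(epochs, win, axis=-1)
    return windows[..., ::step, :].mean(axis=-1)


# Synthetic demo: 5 epochs, 62 channels, 400 time points.
rng = np.random.default_rng(0)
epochs = rng.normal(size=(5, 62, 400))
features = sliding_window_means(epochs)
print(features.shape)  # (5, 62, 386)
```

For interval k, `features[:, :, k]` is the (n_epochs × 62) matrix of feature vectors that would be passed to a classifier for that interval.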

To ensure that the predictive ERP measures have validity across individuals, this study used a between-subject cross-validation strategy (Kriegeskorte, Simmons, Bellgowan, & Baker, 2009; Pereira, Mitchell, & Botvinick, 2009). To that end, we trained the LR classifier using eight-fold cross-validation to discriminate between short-interval activity maps of the BART rounds that immediately preceded a decision to cash out versus a decision to continuously inflate the balloon. To balance the number of ERP epochs that preceded pump versus cash out (i.e., two classes), we randomly selected a subset of the ERP short intervals (i.e., samples) in the class with more numerous samples and used the subset as training data. To ensure against variability due to subset selection, the procedure was performed 10 times in each training set, and the results were averaged.
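
One way to organize such a scheme is sketched below in NumPy. The sketch rests on two assumptions that the description above does not spell out: that each fold holds out whole participants (our reading of "between-subject"), and that class balancing is done by random undersampling of the larger class within each training set. All names are hypothetical, and the classifier itself is omitted.

```python
import numpy as np


def subject_folds(subject_ids, n_folds=8, seed=0):
    """Yield (train_idx, test_idx) so that no subject appears in both sets."""
    rng = np.random.default_rng(seed)
    subjects = rng.permutation(np.unique(subject_ids))
    for held_out in np.array_split(subjects, n_folds):
        test_mask = np.isin(subject_ids, held_out)
        yield np.flatnonzero(~test_mask), np.flatnonzero(test_mask)


def undersample(labels, idx, rng):
    """Randomly drop samples of the larger class so both classes are equal in size."""
    pos = idx[labels[idx] == 1]
    neg = idx[labels[idx] == 0]
    n = min(len(pos), len(neg))
    return np.concatenate([rng.choice(pos, n, replace=False),
                           rng.choice(neg, n, replace=False)])


# Synthetic demo: 32 subjects, 20 samples each, about 75% "pump" labels.
rng = np.random.default_rng(1)
subject_ids = np.repeat(np.arange(32), 20)
labels = (rng.random(subject_ids.size) < 0.75).astype(int)

for train_idx, test_idx in subject_folds(subject_ids):
    for _ in range(10):  # repeat the random subset selection, then average the scores
        balanced_idx = undersample(labels, train_idx, rng)
        # fit a classifier on balanced_idx and evaluate it on test_idx here
```

Holding out whole participants means that the test score reflects generalization to unseen individuals rather than to unseen trials of the same individuals.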

To evaluate the performance of the classifiers, the receiver operating characteristic (ROC) curve was plotted, and the area under the curve (AUC) was calculated to quantify the prediction results. A permutation statistical analysis was performed to test whether the achieved AUC values exceeded chance level. The permutation procedure was applied to calculate the 95% confidence interval (CI) of the AUC under the null hypothesis (i.e., there is no predictive information buried in ERP short intervals) by producing a distribution of the AUC with label-permuted trials (repeated 1,000 times). We then checked whether the AUC given by LR classifiers was outside of the 95% CI of the associated label-permuted distribution, in which case we determined that the AUC produced by the LR classifier achieved a significance level of p < .05.
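
The logic of the permutation test can be illustrated with a short NumPy sketch. It is a simplification under stated assumptions: labels are shuffled against a fixed set of classifier scores, whereas a full implementation would retrain the classifier on each label-permuted data set; the function names and the synthetic scores are hypothetical.

```python
import numpy as np


def auc(scores, labels):
    """Area under the ROC curve: the probability that a random positive sample
    scores higher than a random negative one (ties count as one half)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels).astype(bool)
    diff = scores[labels][:, None] - scores[~labels][None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size


def exceeds_chance(scores, labels, n_perm=1000, seed=0):
    """Compare the observed AUC with the central 95% interval of a label-permuted null."""
    rng = np.random.default_rng(seed)
    observed = auc(scores, labels)
    null = np.array([auc(scores, rng.permutation(labels)) for _ in range(n_perm)])
    lo, hi = np.percentile(null, [2.5, 97.5])
    return observed, (lo, hi), not (lo <= observed <= hi)


# Synthetic demo: 100 samples per class, scores shifted by one SD for class 1.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 100)
scores = labels + rng.normal(size=labels.size)
observed, (lo, hi), significant = exceeds_chance(scores, labels)
print(round(observed, 3), round(lo, 3), round(hi, 3), significant)
```

An observed AUC that falls outside the interval spanned by the 2.5th and 97.5th percentiles of the null distribution corresponds to the p < .05 criterion described above.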

Statistics

Descriptive data are presented as mean ± standard error. The significance level was set at .05.

A repeated-measures analysis of variance (ANOVA) was performed on the behavioral measure of risky ratio as well as on the ERP measurements, with risk decision (pump vs. cash out) and round number as within-subject factors. The factor of round number had three levels for behavioral data (Rounds 1, 2, and 3), whereas it had two levels for ERP data (using the ERPs elicited by the Round 1 outcome to predict the Round 2 decision, and using the ERPs elicited by the Round 2 outcome to predict the Round 3 decision; the ERPs elicited by the Round 3 outcome were not analyzed, because no further decision could be made after Round 3).

For the analysis of response time (RT), the generalized linear mixed model (GLMM; Lo & Andrews, 2015) was used because of the unbalanced nature of the data (i.e., one participant, #27, never pumped up balloons in Round 3, and 13 participants never cashed out in Round 1). The GLMM was configured with intercept, risk decision (pump vs. cash out), round number, and the Risk Decision × Round Number interaction as four fixed factors, and subject as a random factor. As suggested by Lo and Andrews (2015), the RT distribution was modeled as inverse Gaussian (with an identity link function between the predictors and RT).

Greenhouse–Geisser correction for ANOVA tests was used whenever appropriate. Accordingly, for all the F tests with df > 1 in the numerator, both the corrected p value and the Greenhouse–Geisser epsilon value (ε) are reported. Post hoc testing of significant main effects was conducted using the Bonferroni method. Significant interactions were followed up with simple-effects analyses.

Results

Behavioral results

Risky ratio

The numbers of risky versus safe decisions in each participant are shown in Table 1. The data from all the 44 participants were used to statistically analyze the measure of risky ratio. On average, there were 438.5 ± 4.04 “pump” trials and 11.5 ± 4.04 “cash out” trials in Round 1; 170.1 ± 5.43 “pump” trials and 49.6 ± 5.28 “cash out” trials in Round 2; and 33.1 ± 3.36 “pump” trials and 52.1 ± 2.40 “cash out” trials in Round 3.

Table 1 Trial number in the three rounds of BART for each participant

The risky ratio was 97.4 ± 0.9%, 77.4 ± 2.5%, and 36.4 ± 3.2% in Rounds 1, 2, and 3, respectively (see Fig. 3a). A one-way repeated-measures ANOVA revealed a significant effect of round number, F(2, 86) = 235, p < .001, ε = 0.934, \( \eta_p^2 = 0.845 \). Pairwise comparisons showed that the risky ratio decreased significantly from Round 1 to Round 2 and from Round 2 to Round 3 (ps < .001).

Fig. 3

Behavioral results. a Risky ratio in the three rounds. b Response time in different conditions. Star indicates p < .05 in pairwise comparisons. Light red for the RT in Round 1 indicates that the measure should be cautiously interpreted due to its relatively low reliability (see Limitations in the Discussion section). (Color figure online)

Response time

The trial numbers in the three rounds of the BART for each participant are listed in Table 1.

The GLMM (corrected Akaike information criterion = 3,152, Bayesian information criterion = 3,180) revealed a significant main effect of round number, F(2, 244) = 8.88, p < .001. The RT in Round 1 (763 ± 157 ms) was longer than that in Round 2 (660 ± 156 ms, p < .001) and Round 3 (688 ± 157 ms, p = .012).

Most importantly, a significant interaction of Risk Decision × Round Number was detected, F(2, 244) = 39.1, p < .001 (see Fig. 3b). The RT for the cash-out decision (891 ± 160 ms) was longer than that for the pump decision (635 ± 156 ms) in Round 1, F(1, 244) = 37.0, p < .001, but this pattern was reversed in Round 2 (pump RT = 694 ± 157 ms, cash-out RT = 627 ± 156 ms), F(1, 244) = 7.41, p = .007, and in Round 3 (pump RT = 817 ± 161 ms, cash-out RT = 558 ± 156 ms), F(1, 244) = 38.5, p < .001.

ERPs

In this section, we did not include the participants who had fewer than 15 trials in one or more conditions, because these ERP data had a relatively low signal-to-noise ratio. Consequently, the final sample for ERP analysis consisted of 32 participants (refer to Table 1).
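The exclusion rule can be illustrated with a short sketch. The participant labels, condition names, and trial counts below are hypothetical (they are not the values in Table 1); only the threshold of 15 trials comes from the text.

```python
MIN_TRIALS = 15  # threshold stated in the text


def eligible_participants(trial_counts, min_trials=MIN_TRIALS):
    """Keep participants whose every condition has at least min_trials trials."""
    return [
        pid
        for pid, counts in trial_counts.items()
        if all(n >= min_trials for n in counts.values())
    ]


# Hypothetical counts per condition: outcome round, then the next-round decision.
trial_counts = {
    "P01": {"R1_then_pump": 180, "R1_then_cash": 40, "R2_then_pump": 35, "R2_then_cash": 50},
    "P02": {"R1_then_pump": 200, "R1_then_cash": 12, "R2_then_pump": 60, "R2_then_cash": 45},
    "P03": {"R1_then_pump": 150, "R1_then_cash": 15, "R2_then_pump": 15, "R2_then_cash": 70},
}
print(eligible_participants(trial_counts))  # ['P01', 'P03']
```

In this example P02 is dropped because one condition has only 12 trials, whereas P03 is retained because exactly 15 trials meets the criterion.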

P1 component

Mean amplitude

The main effect of risk decision was significant, F(1, 31) = 31.6, p < .001, \( {\eta}_p^2 \) = 0.505 (see Fig. 4). The P1 evoked by the current outcome showed larger amplitudes when participants decided to pump up balloons in the next round (2.79 ± 0.19 μV) than when they decided to cash out (1.34 ± 0.22 μV). The main effect of round number was not significant, F(1, 31) < 1, \( {\eta}_p^2 \) = 0.002. The interaction of Risk Decision × Round Number was not significant, F(1, 31) < 1, \( {\eta}_p^2 \) = 0.001.

Fig. 4

Lateral occipital P1 component (n = 32). Data were averaged at the electrode sites of O1, O2, PO3, PO4, PO5, PO6, PO7, and PO8 (magenta circles in topographies). To ensure a relatively high signal-to-noise ratio, individual ERPs that contained fewer than 15 trials in one or more conditions were excluded, resulting in a final sample of 32 participants. Here, the ERP time locked to the onset of the outcome presentation in the current decision round is conditioned on the decision (pump or cash out) in the next round (e.g., the waveform displayed by a red solid line shows the averaged ERP epochs that were time locked to the outcome of Round 1 and were followed by a “pump” decision in Round 2). (Color figure online)

Peak latency

The main effect of risk decision was significant, F(1, 31) = 668, p < .001, \( {\eta}_p^2 \) = 0.956 (see Fig. 4); the P1 evoked by the current outcome had shorter latencies when participants decided to pump up balloons in the next round (106 ± 1.22 ms) than when they decided to cash out (144 ± 1.23 ms). The main effect of round number was not significant, F(1, 31) < 1, \( {\eta}_p^2 \) < 0.001. The interaction of Risk Decision × Round Number was not significant, F(1, 31) < 1, \( {\eta}_p^2 \) = 0.005.

P3 component

For the mean amplitude of the P3, the main effect of risk decision was significant, F(1, 31) = 905, p < .001, \( {\eta}_p^2 \) = 0.967; the P3 evoked by the current outcome showed much larger amplitudes when participants decided to pump up balloons (8.26 ± 0.21 μV) rather than cash out (0.96 ± 0.20 μV) in the next round.

The main effect of round number (i.e., risk level) was significant, F(1, 31) = 8.80, p = .006, \( {\eta}_p^2 \) = 0.221; the outcome-evoked P3 was larger in Round 2 (5.09 ± 0.24 μV) than in Round 1 (4.13 ± 0.22 μV).

Most interestingly, the interaction of Risk Decision × Round Number was significant, F(1, 31) = 4.86, p = .035, \( {\eta}_p^2 \) = 0.136 (see Fig. 5). When participants decided to pump up balloons in the upcoming round, the P3 evoked by the outcome of Round 2 (9.01 ± 0.31 μV) was larger than that evoked by the outcome of Round 1 (7.52 ± 0.28 μV), F(1, 31) = 12.0, p = .002. When participants decided to cash out in the next round, the P3 did not differ significantly between Round 1 (0.75 ± 0.26 μV) and Round 2 (1.17 ± 0.28 μV), F(1, 31) = 1.27, p = .268.

Fig. 5

Parietal P3 component (n = 32). Data were averaged at the electrode sites of Cz, CPz, Pz, CP1, and CP2 (blue circles in the topographies). To ensure a relatively high signal-to-noise ratio, individual ERPs that contained fewer than 15 trials per condition were excluded. Here, the ERP time locked to the onset of the outcome presentation in the current decision round is conditioned on the decision in the next round. (Color figure online)

Single-trial prediction

For the ERP data from the 32 participants, the LR classifier was trained on seven-eighths of the participants (n = 28) and then applied to the remaining participants (n = 4) to discriminate between short ERP intervals preceding risky versus safe choices. This procedure was repeated across eight folds, yielding eight testing sets (see Fig. 6).
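A participant-wise eight-fold split of this kind can be sketched as follows. The function name, the random seed, and the shuffling of participant order are assumptions made for illustration; the text specifies only that each fold uses 28 training and 4 testing participants.

```python
import random


def participant_folds(participant_ids, n_folds=8, seed=0):
    """Split participants into n_folds disjoint test sets; return (train, test) pairs."""
    ids = list(participant_ids)
    random.Random(seed).shuffle(ids)  # assumed: random assignment to folds
    folds = [ids[i::n_folds] for i in range(n_folds)]
    return [
        ([pid for pid in ids if pid not in test], test)
        for test in folds
    ]


splits = participant_folds(range(1, 33))  # 32 participants
for train, test in splits:
    print(len(train), len(test))  # 28 training and 4 testing participants per fold
```

Splitting by participant rather than by trial means that no participant contributes data to both the training and the testing set of the same fold.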

When using the outcome-evoked ERP in Round 1 to predict the decision in Round 2, the AUC of the null hypothesis (H0: no predictive information in the ERP) was calculated in the eight testing sets, resulting in a mean 95% CI of [0.51, 0.57] (see Fig. 6a). For the AUC produced by the LR classifiers, there were three distinctive peaks across the time window of −200 to 1,400 ms (Peak 1 = 94 ± 1.32 ms, AUC = 0.64 ± 0.02; Peak 2 = 158 ± 2.10 ms, AUC = 0.70 ± 0.03; Peak 3 = 370 ± 2.45 ms, AUC = 0.84 ± 0.02; see Fig. 6a and c). Considering that the weight of the LR classifier indicates the predictive contribution of its associated electrode site, the electrode sites that contributed most to the prediction were PO6, PO8, and O1 for Peak 1; PO8, PO6, and PO3 for Peak 2; and CPz, CP1, and Pz for Peak 3.
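This section does not state how the null AUC distribution was obtained. One common approach, assumed in the sketch below, is to shuffle the decision labels many times and recompute the AUC each time. The function names, the number of permutations, and the synthetic classifier scores are all hypothetical.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparisons; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def null_auc_interval(scores, labels, n_perm=500, seed=0):
    """Central 95% interval of AUC values obtained after shuffling the labels."""
    rng = random.Random(seed)
    shuffled = list(labels)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null.append(auc(scores, shuffled))
    null.sort()
    return null[int(0.025 * n_perm)], null[int(0.975 * n_perm) - 1]


# Synthetic classifier outputs: 50 "pump" (1) and 50 "cash out" (0) trials.
rng = random.Random(1)
labels = [1] * 50 + [0] * 50
scores = [rng.gauss(0.5 * y, 1.0) for y in labels]
observed = auc(scores, labels)
low, high = null_auc_interval(scores, labels)
print(f"observed AUC = {observed:.2f}, null 95% interval = [{low:.2f}, {high:.2f}]")
```

An observed AUC above the upper bound of the null interval would then be taken as evidence that the ERP interval carries predictive information.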

Fig. 6

Results of the single-trial prediction. LR classifiers were trained using the 62-channel ERPs (60 ms in length) evoked by periodical outcomes to predict whether the future decision would be risky or safe. For the ERP data from the 32 participants, eight-fold cross-validation resulted in eight testing sets (n = 4 per set). a–b Area under the curve (AUC) produced by the LR classifiers (in red) and the 95% confidence interval (CI) of the null hypothesis (the upper and lower 95% CI boundaries are displayed using black solid and dotted lines, respectively). Light color indicates the results of the eight testing sets, whereas dark color indicates their average. c–d Receiver operating characteristic (ROC) curves for classification by logistic regression on the eight testing sets. ROC results were calculated using four ERP intervals (60 ms each; i.e., the intervals covering the three distinctive peaks in the AUC waveforms and a baseline interval, −160 to −100 ms). a and c Outcome-evoked ERPs in Round 1 predict the decision (to cash out vs. to continue inflating the balloon) in Round 2. b and d Outcome-evoked ERPs in Round 2 predict the decision in Round 3. (Color figure online)

When using the outcome-evoked ERP in Round 2 to predict the decision in Round 3, the AUC of the null hypothesis was calculated for the eight testing sets, resulting in a mean 95% CI of [0.46, 0.56] (see Fig. 6b). For the AUC produced by the LR classifiers, there were three distinctive peaks across the time window of −200 to 1,400 ms (Peak 1 = 94 ± 1.55 ms, AUC = 0.62 ± 0.03; Peak 2 = 162 ± 2.36 ms, AUC = 0.74 ± 0.04; Peak 3 = 370 ± 3.78 ms, AUC = 0.90 ± 0.02; see Fig. 6b and d). The electrode sites that contributed most to the prediction were PO8, PO6, and O1 for Peak 1 and Peak 2; and Pz, CPz, and P1 for Peak 3.
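The ranking of electrode sites by classifier weight can be sketched as follows, assuming one weight per electrode (e.g., after summarizing over the time points within an interval) and assuming that contribution is judged by absolute weight; neither detail is specified in this section. The weights shown are illustrative values, not the fitted ones.

```python
def top_electrodes(weights, k=3):
    """Return the k electrode labels with the largest absolute classifier weights."""
    ranked = sorted(weights, key=lambda name: abs(weights[name]), reverse=True)
    return ranked[:k]


# Hypothetical weights for a handful of channels.
weights = {"CPz": 0.82, "CP1": -0.74, "Pz": 0.69, "O1": 0.10, "PO8": -0.05}
print(top_electrodes(weights))  # ['CPz', 'CP1', 'Pz']
```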

It is worth noting that the AUC peaks shown in Fig. 6a–b coincided in time with the peaks of the difference waves between the pump and cash-out conditions in Figs. 4 and 5.

Discussion

Using a modified version of the BART, this study conducted a within-trial predictive analysis to reveal the relationship between brain activity and upcoming risk decision-making. On a given trial, the P1 and P3 components elicited by the periodical outcome in one round predicted the behavioral decision (pump/cash out) in the next round. The amplitude of the P3, but not that of the P1, was sensitive to round number (Round 1/2). The reliability of these findings was verified with both conventional averaging and single-trial analyses.

Major findings and interpretations

Behavioral results

From Round 1 to Round 3, the frequency of risky decisions gradually decreased, indicating that the willingness to pump declined as the reward accumulated and the risk level rose. Meanwhile, the RT for risky decisions increased from Round 1 to Round 3, indicating that choosing the risky option became more difficult for participants. According to Pleskac and Wershbale (2014), the response slowing should not be explained in terms of fatigue or disengagement but rather reflects more planning and deliberate thinking (see also Hassall, Holland, & Krigolson, 2013). Given that the task demand remained constant, the change in RT indicates that the tension between reward and risk became stronger as participants stepped further. Consequently, the escalating risk forced participants to shift from risk seeking to risk avoidance, even though the expected values of the two options were the same. This phenomenon is consistent with the classic St. Petersburg paradox, in which subjective risk preference may change independently of the expected value of options (Weber & Johnson, 2008).

ERP results: Conventional averaging analysis

Regarding the ERP data, the first important finding is that the amplitude and latency of the early P1 component (which peaked between 100 and 150 ms after feedback) predicted whether participants would continue to take risks in the next round; that is, both a larger amplitude and a shorter latency indicated a higher likelihood of making a risky decision. The visual P1 component represents early sensory processing located in the extrastriate cortex and responds to physical stimulus parameters such as luminance (Luck, 2005; Olofsson, Nordin, Sequeira, & Polich, 2008). More importantly, both the amplitude and the latency of the P1 are sensitive to selective attention, such that attended locations elicit a larger P1 than unattended locations do (Clark & Hillyard, 1996), and the P1 latency is shorter for stimuli that preferentially attract attention (Olofsson et al., 2008). Taken together, when the periodical outcome engaged more attentional resources (reflected by a larger P1 amplitude and a shorter P1 latency), participants were more likely to make a risky decision than a safe decision in the upcoming round. Given that the physical characteristics of the positive outcome remained unchanged between risky and safe trials, the P1 difference between these two conditions should not be accounted for by visual features (Weber & Johnson, 2009). Rather, this result may reflect that participants paid more attention to the potential reward associated with a successful pump. That is to say, when participants were more interested in the rewarding aspect of options and paid more attention to it, they were prone to keep taking risks (i.e., goal-directed attention allocation). A wide range of addiction research on reward sensitivity supports our idea that risk preference could be strengthened by attentional allocation (Kambeitz et al., 2016; Losel & Schmucker, 2004; Pardini et al., 2015; Verdejo-Garcia, Chong, Stout, Yucel, & London, 2017). For instance, Fridberg et al. (2010) reported that subjective preference for an option varies according to the relative amount of attention paid to gains versus losses, and that cannabis users who exhibited an attentional bias to gains showed increased risk-taking behavior. It is worth noting that, on visual inspection, the P1 latency varied between conditions; therefore, we analyzed the P1 using two separate time windows. Considering that it is difficult to accurately estimate the peak latency of a latent ERP component by observing ERP waveforms, and that differences in latency do not necessarily indicate differences in component timing (Luck, 2005), the result of the P1 latency (and the P1 findings in general) should be interpreted with caution.

The second ERP finding concerns the P3 component (which peaked approximately 350–450 ms after feedback). The P3 amplitude also predicted whether participants would take a risk in the forthcoming round (larger amplitudes indicated a higher likelihood of a risky choice). Unlike the P1, the P3 showed an interaction between round number and decision, such that the P3 followed by a risky decision was larger after the Round 2 outcome than after the Round 1 outcome, whereas the P3 followed by a safe decision did not differ between rounds (see also Hassall et al., 2013). The P3 has been associated with various cognitive functions, depending upon the experimental paradigm being used (Polich, 2007; Polich & Criado, 2006). According to previous literature, the P3 observed in decision-making tasks may indicate the motivational significance of outcome feedback; this component is larger for outcomes with higher motivational relevance (Martin & Potts, 2009; Nieuwenhuis, Aston-Jones, & Cohen, 2005; Polezzi, Sartori, Rumiati, Vidotto, & Daum, 2010; Schuermann, Endrass, & Kathmann, 2012). The P3 amplitude increased in individuals who attributed more meaning to feedback (De Bruijn, Mars, & Hulstijn, 2004) or showed a stronger desire for rewards (Zheng et al., 2010). Accordingly, the predictive effect of the P3 amplitude may reflect that more motivation was allocated to the plan of making a risky decision, and that the amount of motivation for a risky decision increases with its risk level. We suggest that selective attention (indexed by the P1) was unaffected by the factor of risk, but more motivation (indexed by the P3) was needed to choose the risky option when the risk level increased. When participants plan to make a risky choice, they may feel psychological pressure because of the uncertainty associated with the risky option (Starcke & Brand, 2012). Thus, more motivation is needed to overcome this pressure and push the behavioral plan forward. Furthermore, a higher risk level puts stronger pressure on risk takers, so the level of motivation increases accordingly when making risky decisions in high-risk rounds (Schwager & Rothermund, 2013). This interpretation is consistent with the finding that stronger motivation is necessary to choose risky options, even though the expected value of risky options is the same as that of safe ones (Niv et al., 2012; Polezzi et al., 2010).

As an alternative interpretation, the P3 findings may also reflect attentional focus on the risk factor, given the close relationship between the P3 and attentional function (Pontifex, Hillman, & Polich, 2009). From the perspective of the context updating theory (Donchin & Coles, 1988; Polich, 2007), the P3 in this study might indicate the updating of external information (which would be involved in subsequent decision-making) in working memory. Taking a step further, it is possible that the P1 and P3 correspond to lower-level and higher-level attentional processes, respectively, such that the P1 encodes the bottom-up attentional engagement of rewarding information while the P3 encodes the top-down directed attentional allocation in which the risk information is taken into account (Debener, Kranczioch, Herrmann, & Engel, 2002; Wu & Zhou, 2009). This interpretation could help explain why only the P3, and not the P1, was sensitive to round number, given that both components are related to attentional processes. This “attentional” hypothesis does not necessarily conflict with the aforementioned “motivational” hypothesis, because motivationally significant stimuli preferentially engage attention (Engelmann & Pessoa, 2007). It is also possible that the P3 manifested an interaction between attention and motivation. For instance, the P3 amplitude might reflect a competition between the attention to the potential risk and the motivation to pursue more rewards. This idea awaits examination in future studies.

ERP results: Single-trial analysis

Aside from the results based on averaged ERPs, this study also used single-trial signatures of the brain to discriminate between upcoming risky and safe choices. Here, the basic question is when and where the ERP data carry predictive information about the future decision (Pereira et al., 2009). To answer the “when” question, a sliding window (60 ms in length) was used to define ERP intervals along the time axis, resulting in three peaks in the AUC waveform (approximately at the peak latencies of the “to pump” P1, the “to cash out” P1, and the P3 in the averaged ERPs). Furthermore, the weights of the linear classifiers at the three AUC peaks answered the “where” question, revealing that predictive information mainly came from lateral occipital (the first two peaks) and parietal (the third peak) cortices. The single-trial analysis not only confirmed the robustness of the P1 and P3 findings on averaged ERPs via a data-driven approach but also has the potential to be applied in more practical fields such as consumer psychology (see also Gajewski, Drizinsky, Zülch, & Falkenstein, 2016; Ma, Wang, Wang, Wang, & Wang, 2010).
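The sliding-window definition of ERP intervals can be sketched as follows. The 60-ms width and the −200 to 1,400 ms epoch come from the text; the 20-ms step and the function name are assumptions made purely for illustration, because the step size is not given in this section.

```python
def sliding_windows(start_ms=-200, stop_ms=1400, width_ms=60, step_ms=20):
    """List (onset, offset) pairs of windows that fit entirely inside the epoch."""
    windows = []
    onset = start_ms
    while onset + width_ms <= stop_ms:
        windows.append((onset, onset + width_ms))
        onset += step_ms  # assumed step size, not stated in the text
    return windows


windows = sliding_windows()
print(len(windows), windows[0], windows[-1])  # 78 (-200, -140) (1340, 1400)
```

Training one classifier per window and plotting its AUC against the window position would produce an AUC waveform like the one described above.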

Theoretical significance

In our opinion, the theoretical significance of this study is twofold.

Risk decision-making perspective

The current findings help to understand how the human brain deals with multiple decision variables (e.g., reward and risk) in the time domain. It has been acknowledged that both reward and risk are encoded in the brain and play important roles in risk decision-making (see the Introduction section). Taking one step further, the current findings describe the temporal relationship of these two factors, indicating that the processing of reward precedes that of risk. That is, the neural signals were insensitive to the risk level of the current scenario until a relatively late time window (approximately 350–450 ms) where the P3 reached its peak. The temporal priority of the reward factor may explain why a high level of reward sensitivity leads to maladaptive risk taking regardless of the perceived risk (Cservenka, Herting, Seghete, Hudson, & Nagel, 2012; Lyvers, Duff, Basch, & Edwards, 2012).

The current findings also have implications for research on risk preference. Regarding the determinants of risk preference, previous studies have largely focused on voluntary cognitive processes associated with analytical thinking (e.g., Weber & Johnson, 2008). This idea has been well received in the BART literature. For instance, Wallsten et al. (2005) listed probability learning, discrimination, and probabilistic categorization as the major cognitive factors underlying the influence of periodical outcome on subsequent risk preference. The role of involuntary processes in the BART, however, has been acknowledged only recently (e.g., Pleskac & Wershbale, 2014). The results of this study indicate that risk preference could be modulated as early as about 100 ms (i.e., the P1 latency) after feedback presentation, which should leave little time for analytical thinking. Accordingly, we suggest that the importance of bottom-up involuntary processes in risk preference should be highlighted in future studies.

ERP perspective

The current study is the first to combine within-trial analysis with the ERP technique. Considering that the round-based BART has a task structure similar to that of trial-based simple gambling tasks, one might argue that the within-trial prediction in this study adds no new knowledge beyond the traditional between-trial prediction (e.g., Zhang et al., 2014; Zhang et al., 2013).

We disagree with this opinion because in simple gambling tasks, outcome feedback indicates the ending point of the ongoing decision-making. Accordingly, between-trial prediction reflects the relationship between outcome in one occasion and behavioral output in another occasion. In contrast, different rounds in the BART are not independent; periodical outcome in one round also functions as an option that participants should evaluate in the next round. Accordingly, within-trial prediction reflects the relationship between option assessment and behavioral output in the same occasion.

Indeed, this difference has fundamentally affected the ERP patterns. Specifically, while previous studies using between-trial ERP analysis focused on the FRN, which indicates trial-by-trial feedback learning (Cavanagh et al., 2010; Cohen et al., 2011; San Martín et al., 2013), the current study highlights the potential link between the P1 and behavior. The early P1 component has not been considered to play a significant role in decision-making in previous studies. Although the P1 amplitude might be modulated by personality traits (e.g., impulsivity) that affect decision preference (Houston & Stanford, 2001), no study to date has reported that the P1 could predict behavioral choices. In our opinion, this finding has the potential to enrich the ERP literature on decision-making.

Limitations

To sum up, this study has revealed the temporal dynamics of the brain–behavior relationship during risk decision-making, which showed distinct patterns of neural prediction compared with those based on between-trial studies. However, a few limitations should be noted.

First, in order to control feedback learning, the information of expected payoff and winning probability was provided to participants. Consequently, the BART changed from decision under ambiguity to decision under risk (Starcke & Brand, 2016). According to Schonberg et al. (2011), the original BART might confound risk preference with learning ability. That is, it is undetermined whether a participant chooses the risky option because of his or her risk preference or because he or she believes that it is the optimal choice. Thus, Schonberg et al. (2011) suggested separating the learning factor from the BART (see also Pleskac, 2008). Nevertheless, one should be cautious when comparing the current results with previous findings based on the original BART, the latter of which might have been modulated by the factor of learning.

A related issue is whether the confounding effect of feedback learning was successfully controlled in our task. One might argue that a learning component cannot be completely excluded even when the probability information is explicitly given (Jessup & O’Doherty, 2011). However, the absence of an FRN effect in the current results might support our idea, given the close association of the FRN with reward learning (Cohen et al., 2011; Walsh & Anderson, 2012).

Second, the RT data of the cash-out condition in Round 1 (see Fig. 3b) should be cautiously interpreted, because they were calculated based on very few trials (11.5 ± 4.04 trials per participant), and the 13 participants who did not cash out in Round 1 were excluded from the analysis. We admit that it is unclear why some participants chose to cash out with zero points in Round 1, but response error is a possible explanation.

Third, given that the P1 amplitude was insensitive to round number, we hypothesize that the attentional function was independent of the risk factor. Some previous studies, however, suggest the opposite (Karlsson, Loewenstein, & Seppi, 2009). It remains unclear whether using a higher level of incentive in the task would create an observable experimental effect. In addition, although we interpret the effect of round number from the perspective of risk level, the reward magnitude of each balloon also increased as a function of round number. Consequently, the potential interaction between risk level and reward magnitude could not be excluded. In this study we dismiss the possibility that the effect of round number on the P3 amplitude reflected variations in reward magnitude rather than risk level, because the P3 was sensitive to round number only when participants made risky decisions. In our opinion, if on-screen reward magnitude indeed modulated the P3, then this component should also have been enlarged as a function of round number when participants made safe decisions, because choosing the safe option would guarantee the acquisition of the on-screen reward. However, given the limitations of the current task design, we hope that follow-up studies using alternative tasks will clarify whether this issue affects the interpretation of the ERP components in this study.

Finally, this study did not consider individual differences. For example, it has been widely proposed that men and women differ in risk decision-making (de Visser et al., 2010). In addition, personality traits such as impulsivity and sensation seeking also affect BART performance (Lauriola, Panno, Levin, & Lejuez, 2014). All of these limitations indicate that follow-up investigations are necessary.