1 Introduction

The prediction of the standard consumption-savings model, that people always discount an income at the market interest rate, has been found inconsistent with empirical results.Footnote 1 One important anomaly, dating back to Thaler (1981), is the magnitude effect: when comparing a smaller-sooner reward with a larger-later reward, people favor the later reward more often as the amounts of the two rewards are scaled up. Studies on the magnitude effect help us better understand how people make intertemporal choices differently for small and large rewards, for instance, how people allocate inheritance or lottery income over time.

While several experiments have found a magnitude effect, no study has explored the underlying mechanisms.Footnote 2 Intertemporal choices are affected by both the discount rate and the curvature of the atemporal utility function, where the latter determines intertemporal substitutability. Accordingly, the magnitude effect could have two potential causes: people are more patient for larger rewards, and/or people find larger rewards more substitutable across time. The two mechanisms are indistinguishable in a single-reward task where a decision-maker can only receive a reward on a single date (either sooner or later), but they have different implications and lead to different behavioral patterns in more general situations.

One such situation is the intertemporal allocation task, where a decision-maker allocates a fixed budget between two dates, given a (usually positive) return to the share allocated to the later date. Choices in intertemporal allocation tasks can be characterized by two attributes: the average shares allocated to a particular date given a set of return rates, and the responsiveness of these shares to changes in the rate of return. The former attribute is determined by the discount rate and the latter one by the intertemporal substitutability. Thus, studying the magnitude effect in intertemporal allocation tasks can help us identify the mechanism underlying the magnitude effect.

In this paper, we perform a lab experiment to investigate how choices in intertemporal allocation tasks change with the size of the total budget, and in particular, whether the stakes impact intertemporal preferences through patience (the discount rate) or intertemporal substitutability (the atemporal utility function).

Several theories offer explanations for the magnitude effect in single-reward tasks. Benhabib et al. (2010) posit that a fixed cost of waiting makes people impatient for small outcomes but matters less when outcomes are large. Noor (2011) proposes a magnitude-dependent discounting model in which the discount rate of a dated outcome is decreasing in the size of the outcome. Fudenberg and Levine (2006) predict that people exert costly self-control when stakes are high but indulge themselves when stakes are low, which generates a magnitude effect. Holden and Quiggin (2017) assume that people take into account more background consumption when experimental rewards are larger, which also explains the magnitude effect in single-reward tasks. When those theories (suitably extended) are applied to intertemporal allocation tasks, Benhabib et al. (2010) and Noor (2011) predict a magnitude effect on the discount rate, while Fudenberg and Levine (2006) and Holden and Quiggin (2017) predict a magnitude effect on the utility curvature. Our experiment helps to further distinguish the explanatory power of these different models.

We employ the Convex Time Budget (CTB) method introduced by Andreoni and Sprenger (2012). It allows subjects to form a portfolio of a sooner reward and a later reward given a budget constraint. The possibility for subjects to make interior choices (and not only corner choices as in single-reward tasks) enables researchers to simultaneously identify the discount rate and the intertemporal substitutability. In particular, the identification of the intertemporal substitutability comes from subjects’ allocation of rewards as a response to changes in the interest rate, which is exactly the definition of the elasticity of intertemporal substitution, and hence the method provides a robust measure against misspecification.Footnote 3

The design of our experiment has three main features. First, all subjects receive equal amounts of participation fees on the sooner date and the later date regardless of their choices, and the payment conditions are constant across time. Thus, the transaction costs and the trustworthiness of the payments are equalized across periods. Second, we implement two treatments. In one treatment subjects allocate between today and 4 weeks later, while in the other treatment subjects allocate between 4 weeks later and 8 weeks later. This allows us to assess whether the magnitude effect is affected by the inclusion of a front-end delay. Finally, by assuming a simple yet popular model, the CTB method allows us to identify the discount rate and the atemporal utility function simultaneously. As a result, we can disentangle the channels of the magnitude effect.

We find evidence of the magnitude effect in intertemporal allocation tasks: the budget share allocated to the later date is increasing in the total budget. The size of the magnitude effect is found to be decreasing in the stake. The pattern is not affected by whether or not a front-end delay is present. At the aggregate level as well as at the individual level, we find magnitude effects both on the discount rate and on intertemporal substitutability. Both channels have considerable impacts on predicted choices. We find that the latter effect is not the same as the magnitude effect on risk attitudes found in previous studies, and hence it might be problematic to correct for the curvature of utility functions by risk attitudes. Instead, the magnitude effect on intertemporal substitutability is consistent with theories proposing that people integrate experimental rewards with more background wealth as the size of rewards gets larger.

The remaining part of this paper is structured as follows: We introduce our experimental design in Sect. 2. In Sect. 3 we formulate our hypotheses. In Sect. 4, we investigate non-parametrically the magnitude effect and its relation with the front-end delay. We explore the channels by parametric estimation both at the aggregate level and at the individual level in Sect. 5. In Sect. 6, we discuss the interpretations of our findings. We conclude in Sect. 7.

2 Experimental design

2.1 The Convex Time Budget method, parameters, and implementation

The basis of our experimental design is the Convex Time Budget method introduced by Andreoni and Sprenger (2012). The method consists of a set of intertemporal allocation tasks. In each decision, subjects are asked to allocate \(N\) tokens to two dates: \(t\) days from today, and \(\left( {t + \tau } \right)\) days from today. Each token allocated to \(t\) is worth \(P_{t}\) euros, while each token allocated to \(\left( {t + \tau } \right)\) is worth \(P_{t + \tau }\) euros. If a subject allocates \(n_{t}\) tokens to the sooner date and \(n_{t + \tau }\) to the later date, the sooner reward is \(z_{t} = P_{t} \cdot n_{t}\) euros and the later reward is \(z_{t + \tau } = P_{t + \tau } \cdot n_{t + \tau }\) euros.

Choices are subject to the budget constraint, \(n_{t} + n_{t + \tau } \le N\), and the non-negativity constraints, \(0 \le n_{t} ,n_{t + \tau } \le N\). Subjects are told that they can allocate any number of tokens they like to either of the two dates. Examples of both corner choices and interior choices are given so that subjects do not hesitate to make either type of choice.

Decisions with the same total budget, \(N\), are grouped in one decision form, which is displayed on one page. There are seven decisions in each decision form. The return to each token allocated to the later date is fixed as \(P_{t + \tau } =\) €0.20, while the return to each token allocated to the sooner date is varied and takes the values \(P_{t} =\) €0.20, €0.19, €0.18, €0.17, €0.16, €0.15, and €0.14. Hence, those returns imply seven gross interest rates, \(R =\) 1, 1.05, 1.11, 1.18, 1.25, 1.33, and 1.43, respectively, over a period of \(\tau\) days. The constraints can be rewritten as

$$R \cdot z_{t} + z_{t + \tau } \le m$$
$$z_{t} ,z_{t + \tau } \ge 0$$

where \(m\) is the total budget and \(m = P_{t + \tau } \cdot N\).
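The mapping from token prices to gross interest rates and to the rewritten budget constraint can be checked with a few lines of arithmetic. The following is a minimal sketch, not the experimental software; the function names are illustrative assumptions, and prices are held in euro cents so that the arithmetic is exact.

```python
from fractions import Fraction

# Token prices from the text, in euro cents (names are illustrative assumptions).
P_LATER_CENTS = 20
P_SOONER_CENTS = [20, 19, 18, 17, 16, 15, 14]


def gross_rate(p_sooner_cents, p_later_cents=P_LATER_CENTS):
    """Gross interest rate R = P_{t+tau} / P_t implied by the two token prices."""
    return Fraction(p_later_cents, p_sooner_cents)


def rewards(n_sooner, n_later, p_sooner_cents, p_later_cents=P_LATER_CENTS):
    """Euro rewards (z_t, z_{t+tau}) for a given token allocation."""
    z_sooner = Fraction(p_sooner_cents * n_sooner, 100)
    z_later = Fraction(p_later_cents * n_later, 100)
    return z_sooner, z_later


def satisfies_budget(n_sooner, n_later, n_tokens, p_sooner_cents):
    """Check R * z_t + z_{t+tau} <= m, where m = P_{t+tau} * N."""
    z_sooner, z_later = rewards(n_sooner, n_later, p_sooner_cents)
    m = Fraction(P_LATER_CENTS * n_tokens, 100)
    return gross_rate(p_sooner_cents) * z_sooner + z_later <= m


# Gross rates rounded to two decimals, for comparison with the list in the text.
rates = [round(float(gross_rate(p)), 2) for p in P_SOONER_CENTS]
print(rates)
```

With \(N = 100\) and \(P_{t} =\) €0.17, for example, allocating 40 tokens to the sooner date and 60 to the later date exhausts the budget of \(m =\) €20 exactly.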

We implement the CTB method by a zTree program (Fischbacher, 2007). Figure 1 shows the interface of a typical decision form. Each decision takes a row. Decisions can be made by scrolling the bars. Once an adjustment is made for one decision, the amounts of the sooner reward and of the later reward in that decision are automatically calculated and displayed.

Fig. 1 The interface of a typical decision form in Part I

To avoid any possible effects of initial values, the amounts of rewards are initially blank. Decisions cannot be submitted until all the scroll bars have been adjusted at least once.

2.2 Procedures

Our experiment has two parts. Part I consists of five decision forms, with \(N =\) 100, 200, 300, 400, and 800, respectively. The order is randomly drawn for each subject. Subjects can move to a specific decision form by clicking the button with the corresponding number. They can go to any decision form at any time, regardless of whether the current decision form is completed. Decisions are automatically stored when a subject switches to another decision form. This makes it easy for subjects to compare their choices across total budgets should they wish to do so. Decisions can only be submitted once all 35 decisions are completed.

We randomly assign subjects to one of two treatment groups. In the Present Group, the sooner date is today while the later date is 4 weeks from today, i.e., \(t = 0\) and \(\tau = 28\). In the Delayed Group, the sooner date is 4 weeks from today while the later date is 8 weeks from today, i.e., \(t = 28\) and \(\tau = 28\). Comparing the two groups enables us to check if there exists a present bias on average and, more importantly, if there exists a magnitude effect when no rewards are available in the present.Footnote 4

We also test the intertemporal independence of preferences over monetary rewards.Footnote 5 One alternative hypothesis is that a subject in the Delayed Group may allocate less to the sooner date if she has allocated a large amount of money to an even sooner date, since the desire for extra consumption has already been partly satisfied. A similar hypothesis applies to the Present Group: a subject in the Present Group may allocate less to the later date if she has already allocated a large amount of money to an even later date, since the guilt about not saving has been partially relieved. If preferences are intertemporally dependent, the use of a model with a time-separable preference is likely to be problematic. Thus, we want to test the hypothesis of intertemporal independence before we perform a parametric estimation with a time-separable model.

We use Part II to test the intertemporal independence. It is composed of an extended CTB decision form with seven decisions. Subjects are asked to allocate 400 tokens to three dates, today, 4 weeks from today, and 8 weeks from today. One additional restriction is imposed, depending on which group one is in. A subject in the Present Group can allocate either 0 or 200 tokens to 8 weeks from today; she cannot choose other numbers. But she is still free to allocate any number of tokens between today and 4 weeks from today. Similarly, a subject in the Delayed Group can allocate either 200 or 400 tokens to today. She is still free to allocate any number of tokens (if there remains some) between 4 weeks from today and 8 weeks from today. The restrictions and the returns to one token allocated are shown in Table 1.

Table 1 Restrictions on the number of tokens and returns to one token allocated to a specific date in Part II

The additional date (8 weeks from today for the Present Group or today for the Delayed Group) is accompanied by a very high return for the Present Group and a very low return for the Delayed Group so that subjects are induced to allocate 200 tokens to this additional date. If they do so, the remaining task is equivalent to the one with a total budget of 200 tokens in Part I. This characteristic makes the two decision forms comparable.

We do not directly give a fixed reward on the additional date. This is because a fixed reward might be mentally isolated from the allocation task due to narrow bracketing, and hence the test of intertemporal independence in the allocation task may be invalid.

At the end of the experiment, subjects were asked to finish a questionnaire. As in previous studies with the CTB method, we asked about subjects’ expenditures in a typical week. The average response was €55.22 per week or €7.89 per day.

2.3 Experimental payments

The payments are composed of two parts. First, all subjects receive a €5 participation fee on each of the two dates scheduled in Part I. Second, each subject has a 10% chance to receive earnings from decisions. Before the experiment starts, each subject is randomly given a lottery number, ranging from 0 to 9. After all subjects in a session finish the questionnaire, the experimenter invites one of the subjects to roll a ten-sided die in front of all subjects in the session. Subjects who have a lottery number that equals the die roll get the earnings from decisions. One decision is randomly selected from the 42 decisions in the two parts as the decision that counts. If the decision that counts is from Part I, the allocation in that decision will be realized as the earnings from decisions. If the decision that counts is from Part II, the allocation will be realized and the subject will also receive a €5 participation fee on the additional date in Part II; hence a subject will receive three participation fees if a decision in Part II is realized. All the rules above were articulated in the instructions, and the instructions were always read aloud before either part of the experiment.

The earnings were paid by bank transfer to subjects’ checking accounts. We placed the transfer orders soon after the experiment and sent reminder emails with information about the incoming amounts on the day of the experiment and on all the payment dates. Given the reliability of the banking service, subjects could expect to receive all delayed payments exactly on the appropriate payment dates, while some of the present payments might be received one day after the experimental day due to inter-bank processing.

We believe the payment method we used was as good as cash in terms of liquidity. Checking accounts are used in private transactions such as paying rent, and they are linked to debit cards. In the Netherlands, debit cards are widely used for daily transactions, without any transaction fees, in almost all kinds of stores, including supermarkets, university restaurants, and bookstores. The questionnaire included a survey on subjects’ use of debit cards. The responses show that bank transfers give high liquidity to the rewards so that no isolation effect should be expected due to the payment method.Footnote 6

2.4 Transaction costs and credibility of payments

For our experiment, it is extremely important to equalize the transaction costs and the trustworthiness of the payments across periods, because a difference in the transaction costs over the two periods can be a confounding factor of the magnitude effect.

Several facilities were employed to equalize the transaction costs across periods and to increase the credibility of the payments. The transaction costs include the costs to collect rewards, to confirm that the rewards have been received with correct amounts, and to remember the earnings so that they can be consumed on the expected dates.

First, we sent reminder emails with information about the incoming amounts on the day of the experiment and all the payment dates. Subjects knew this from the instructions, so they did not need to worry about forgetting the earnings on the payment dates, a situation in which the expected marginal utility of the delayed rewards might be lowered.

Second, as Andreoni and Sprenger (2012) did, we handed out our business card and told the subjects to contact us immediately if they did not receive a payment on time. This increased the credibility of the payments and also served as a reminder of them.

Third, we asked subjects to fill in a payment reminder card with the amounts of their rewards on the corresponding dates just after their earnings were displayed. This served as a second reminder in case they forgot to check their emails.

In sum, the facts that subjects receive a participation fee on each payment date and that all payments are made by bank transfer help equalize the transaction costs of receiving payments on all dates. At the same time, the business cards, the payment reminder cards, and the reminder emails reduced the risk of forgetting the rewards. The business cards also lowered the perceived default risk. Even though some subjects might still have perceived such a risk, it should be equal across periods since the payment method and all auxiliary facilities were the same.

2.5 Sample

Our experiment was conducted at the CentERlab, Tilburg University.Footnote 7 A total of 203 students of the university participated in one of the 11 sessions, 94 in the Present Group and 109 in the Delayed Group. Each subject made 42 decisions. A session took one hour and ten minutes on average. Twenty-two subjects received earnings from decisions, which averaged €69.16. The overall average earnings were €17.49.

3 Hypotheses

Most previous studies define the magnitude effect in single-reward tasks. Denote a reward \(z_{t}\) on a sooner date \(t\) by \(\left( {z_{t} ,t} \right)\). In a single-reward task, a subject chooses between a sooner reward \(\left( {z_{t} ,t} \right)\) and a later reward \(\left( {z_{t + \tau } ,t + \tau } \right)\), where \(\tau > 0\) is the delay. A subject displays a (positive) magnitude effect if for all \(z_{t} > 0\), \(z_{t + \tau } > 0\) and \(\eta > 1\),

$$\left( {z_{t} ,t} \right)\sim \left( {z_{t + \tau } ,t + \tau } \right) \Rightarrow \left( {\eta z_{t} ,t} \right) \prec \left( {\eta z_{t + \tau } ,t + \tau } \right).$$

In words, the later reward is more favorable if the amounts are scaled up.

We adapt the definition of the magnitude effect to the intertemporal allocation task. A subject makes a choice \(z^{*} \left( {R,m} \right)\) out of a linear budget set \(\left\{ {\left( {z_{t} ,z_{t + \tau } } \right):Rz_{t} + z_{t + \tau } = m} \right\}\), where \(z^{*} = \left( {z_{t}^{*} ,z_{t + \tau }^{*} } \right)\). She displays a (positive) magnitude effect if for all \(m > 0\) and \(\eta > 1\),

$$\frac{{z_{t + \tau }^{*} \left( {R,m} \right)}}{m} \le \frac{{z_{t + \tau }^{*} \left( {R,\eta m} \right)}}{\eta m}$$

and

$$z_{t}^{*} \left( {R,m} \right) > 0 \Rightarrow \frac{{z_{t + \tau }^{*} \left( {R,m} \right)}}{m} < \frac{{z_{t + \tau }^{*} \left( {R,\eta m} \right)}}{\eta m}.$$

In words, a subject puts a larger share of the budget on the later date as the total budget increases; she may fail to do so only if the sooner reward is already zero. The adapted definition is consistent with the original one: options with a larger fraction of the later reward become more favorable if all options in the menu are scaled up by the same factor.
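The adapted definition can be turned into a simple check on a subject's choices at one fixed interest rate. The sketch below is illustrative only: the function name and the example allocations are hypothetical, and the definition itself requires the condition to hold at every interest rate, whereas the function inspects a single one.

```python
from itertools import combinations


def shows_magnitude_effect(choices):
    """Check the adapted definition at one fixed gross interest rate R.

    `choices` is a list of (m, z_sooner, z_later) tuples, one per total budget.
    The later share z_later / m must never fall as m rises, and it must rise
    strictly whenever the sooner reward at the smaller budget is positive.
    """
    for (m_lo, zs_lo, zl_lo), (m_hi, _zs_hi, zl_hi) in combinations(sorted(choices), 2):
        if m_lo == m_hi:
            continue
        # Compare shares by cross-multiplication to avoid rounding problems:
        # zl_lo / m_lo <= zl_hi / m_hi  <=>  zl_lo * m_hi <= zl_hi * m_lo.
        lo_cross = zl_lo * m_hi
        hi_cross = zl_hi * m_lo
        if hi_cross < lo_cross:
            return False  # later share falls with the budget
        if zs_lo > 0 and hi_cross == lo_cross:
            return False  # strict increase required away from the corner
    return True


# Hypothetical allocations at R = 1.25 (so that 1.25 * z_sooner + z_later = m).
example = [(20, 8, 10), (40, 8, 30), (80, 0, 80), (160, 0, 160)]
print(shows_magnitude_effect(example))
```

In the hypothetical example, the later share rises from 0.5 to 0.75 to 1 and then stays at 1, which is allowed because the sooner reward is already zero at the €80 budget.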

Hypothesis 1

(magnitude effect on budget share): \(\frac{{z_{t + \tau }^{*} }}{m }\) is increasing in \(m\).

We are also interested in whether the magnitude effect is affected by the presence of a front-end delay. Benhabib et al. (2010) suggest that a fixed cost of delaying rewards can account for the magnitude effect in single-reward tasks since the fixed cost matters less in case the rewards are scaled up. However, it is not clear if this cost is incurred only when a present reward is delayed or if it applies equally to delaying a future reward. We thus test whether the magnitude effect is smaller or even non-existent if the sooner reward is also in the future.

Hypothesis 2

(a front-end delay leads to a smaller magnitude effect): \(\frac{{z_{t + \tau }^{*} }}{m }\) changes less with \(m\) in the Delayed Group than in the Present Group.

The two hypotheses above can be tested without assuming a specific model.

Conditional on finding a positive magnitude effect, we wish to explore the channels of the magnitude effect. If intertemporal independence is supported by our results (which it is, as we show below), we estimate the parameters of preferences under the assumption that subjects maximize a time-separable utility function with power atemporal utility and quasi-hyperbolic discounting, i.e., subjects maximize

$$U\left( {z_{t} ,z_{t + \tau } } \right) = \delta^{t} \frac{{\left( {z_{t} + \omega } \right)^{\alpha } - 1}}{\alpha } + \beta^{1_{t = 0}} \delta^{t + \tau } \frac{{\left( {z_{t + \tau } + \omega } \right)^{\alpha } - 1}}{\alpha }, \tag{1}$$

where \(\beta\) is the present bias parameter, which affects the choice only when the sooner date is today (\(1_{t = 0}\) equals one if \(t = 0\) and zero otherwise), \(\delta\) is the daily discount factor, and \(\alpha\) is the exponent parameter. \(z_{t}\) and \(z_{t + \tau }\) are the sooner reward and the later reward, respectively. \(\omega > 0\) is the background consumption mentally integrated with the experimental reward when the decision is made.Footnote 8

When the power utility function is assumed, the elasticity of intertemporal substitution in consumption, \(e_{c} \equiv - \frac{{\ln \left( {\frac{{c_{t + \tau } }}{{c_{t} }}} \right)}}{{\ln \left( {\frac{{u^{\prime}\left( {c_{t + \tau } } \right)}}{{u^{\prime}\left( {c_{t} } \right)}}} \right)}}\), is equal to \(\frac{1}{1 - \alpha }\), where \(c_{t}\) and \(c_{t + \tau }\) are the consumption on the sooner date and on the later date, respectively. Thus, the exponent parameter, \(\alpha\), is an increasing transformation of \(e_{c}\). If \(\alpha \to 1\), the atemporal utility function becomes linear, and the elasticity goes to infinity. In that case, subjects just go for the largest present value, and hence rewards are perfectly substitutable between dates. If \(\alpha \to - \infty\), the atemporal utility function is Leontief, and the elasticity goes to zero. In that case, subjects always divide the total budget into two equal amounts. In general, the larger the value of \(\alpha\), the more substitutable the subject considers the two rewards to be. Therefore, \(\alpha\) is a measure of intertemporal substitutability.
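As a brief check of the stated elasticity, the following sketch uses only the power utility in Eq. (1) and the definition of \(e_{c}\) above, writing \(c = z + \omega\) for total consumption on a date:

$$u\left( c \right) = \frac{{c^{\alpha } - 1}}{\alpha } \Rightarrow u^{\prime}\left( c \right) = c^{\alpha - 1} \Rightarrow \ln \left( {\frac{{u^{\prime}\left( {c_{t + \tau } } \right)}}{{u^{\prime}\left( {c_{t} } \right)}}} \right) = \left( {\alpha - 1} \right)\ln \left( {\frac{{c_{t + \tau } }}{{c_{t} }}} \right),$$

$$e_{c} = - \frac{1}{\alpha - 1} = \frac{1}{1 - \alpha }.$$

Because the log marginal-utility ratio is proportional to the log consumption ratio, the elasticity is the same constant at every allocation, which is why a single parameter \(\alpha\) summarizes intertemporal substitutability in this model.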

Assuming such a model brings several advantages. First, the parameters in this model have clear economic meanings. The discount factor determines the average choice across interest rates and hence measures the patience of the subject; if a subject is more patient, she will allocate more tokens to the later date for all interest rates. The intertemporal substitutability of consumption between different points in time relates to the dispersion of the choices across interest rates since it measures how sensitive the choices are to the interest rate. These behavioral measures are hard to estimate without assuming a model. Due to the non-negativity constraint, choices are censored at the corners if the preference parameters are extreme. As a result, directly measuring the average choice (as a measure of patience) and the dispersion of choices (as a measure of intertemporal substitutability) leads to biases. In contrast, the model we assume is tractable and easy to estimate. Moreover, the model is widely used in both theoretical and empirical applications.Footnote 9

Given the model above, we test the following two hypotheses.

Hypothesis 3

(magnitude effect on the discount factor): \(\delta\) is increasing in \(m\).

Hypothesis 4

(magnitude effect on intertemporal substitutability): \(\alpha\) is increasing in \(m\).

4 Overall effects

4.1 Magnitude effect on budget share

In our data, 28% of the choices are interior, and 62% of our subjects make at least one interior choice. This is very close to the 30% and 63%, respectively, in Andreoni and Sprenger (2012). The relationships between the budget shares and the interest rates are also similar. A discussion about the rationality of subjects in the CTB task is provided in Appendix A.

In Fig. 2 we plot the mean budget share allocated to the sooner date against the gross interest rate, \(R\), of each CTB decision in Part I. We plot separate points for the five total budgets (\(m =\) €20, €40, €60, €80, €160). The budget share allocated to the sooner date declines with the total budget.

Fig. 2 Mean budget share on the sooner date in Part I

The difference seems to be larger when the interest rate is smaller but still positive. This is mainly due to censoring. When the interest rate is zero (\(R = 1\)) or the highest (\(R = 1.43\)), most choices are at the corners for both smaller and larger total budgets.

To judge whether there is a significant magnitude effect, we perform Hotelling’s T-squared tests on the mean differences in budget shares between total budgets, taking seven choices with the same total budget as a vector (see Table 2).Footnote 10 The null hypothesis is that the means of choices are the same across total budgets, taking into account the correlation within-subject. This class of tests makes sense because individual heterogeneity may have made different subjects reveal magnitude effects on tasks with different interest rates (e.g., Subject 1 on Interest Rate 1 while Subject 2 on Interest Rate 2) so that the magnitude effects on all choices would be jointly significant, but the effect on choices with any single interest rate might be insignificant. The results show that the magnitude effect is significant between the total budgets of €20 and €40 and between any two non-adjacent total budgets. These results support Hypothesis 1, which states that a larger share of the budget is allocated to the later date when the size of the budget increases.Footnote 11

Table 2 Multivariate mean difference tests between total budgets

The results also show that the differences are insignificant between adjacent total budgets larger than €20. Since the allocation is monotonic in the total budget and the differences are significant between non-adjacent total budgets, the insignificance suggests that the magnitude effect is the greatest when comparing between the smallest total budgets (€20 and €40) and becomes smaller for larger stakes. The pattern is consistent with the fact that Andersen et al. (2013) found a “statistically significant” but “not economically significant” magnitude effect when they elicited time preferences using very high stakes.Footnote 12

In Online Appendix B, we show group-wise results of the same tests. The results are the same.

4.2 Conditional on the presence of an immediate reward?

We test if the magnitude effect is affected by the existence of a front-end delay. First, Table B.1 shows the results of Hotelling’s T-squared tests on the magnitude effects for the Present Group and the Delayed Group, respectively. We find significant magnitude effects in both groups. This implies that the presence of an immediate reward is not a necessary condition for the magnitude effect.

Second, we plot separate graphs for the two groups in Fig. 3. Subjects in the Delayed Group seem to be slightly more patient than those in the Present Group. However, when we perform Hotelling’s T-squared test on all 35 decisions in Part I between groups, the null hypothesis that the two groups have the same mean responses is not rejected. The p value is 0.2424 with (35, 167) degrees of freedom. Thus, we find no evidence that the magnitude effect is affected by the existence of a front-end delay, and we reject Hypothesis 2.Footnote 13
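The reported degrees of freedom can be reproduced from the group sizes. The sketch below assumes the standard two-sample Hotelling test with a pooled covariance matrix, which the text does not spell out; the function names are illustrative.

```python
def two_sample_df(n1, n2, p):
    """Degrees of freedom of the F statistic for a two-sample Hotelling test
    with pooled covariance: (p, n1 + n2 - p - 1)."""
    return p, n1 + n2 - p - 1


def t2_to_f(t2, n1, n2, p):
    """Convert a Hotelling T-squared statistic to the corresponding F statistic."""
    return t2 * (n1 + n2 - p - 1) / (p * (n1 + n2 - 2))


# 94 subjects in the Present Group, 109 in the Delayed Group, 35 decisions.
print(two_sample_df(94, 109, 35))
```

Under this assumption, 94 and 109 subjects with 35 decisions each give exactly the (35, 167) degrees of freedom reported above.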

Fig. 3 Mean budget shares in the Present Group and the Delayed Group

This finding has an implication for the modeling of the magnitude effect: no matter what generates the magnitude effect, it applies equally to situations with an immediate reward and those without. For instance, if it is a fixed cost of delaying rewards that generates the magnitude effect, as proposed by Benhabib et al. (2010), the cost applies equally to delaying an immediate reward and to delaying a future reward.

4.3 Intertemporal independence

Our results show that Part II is a valid test of intertemporal independence, since most subjects allocated 200 tokens to the additional date in Part II. Only nine out of 203 subjects chose a number other than 200 for the additional date, which affected 41 (2.9%) out of 1415 decisions.

After removing those decisions, we compare the choices with the total budget of €40 between Part I and Part II, separately for each group. Table 3 shows that Hotelling’s T-squared tests fail to reject the null hypothesis that responses to the two parts have the same means. Those results support intertemporal independence, which will be assumed in the next section.Footnote 14

Table 3 Multivariate mean difference tests between Part I and Part II

5 Channels

To disentangle the magnitude effect into two channels, we perform parametric estimations both at the aggregate level and at the individual level. We then test if the preference parameters change with the size of the total budget.

5.1 Aggregate-level estimation

5.1.1 Estimation strategy

In our main specification, we assume a time separable utility function with power atemporal utility functions as in Eq. (1). We set \(\omega\) (background consumption) equal to the average response to the question about one’s typical daily expenditure, €7.89, as Andreoni and Sprenger (2012) did in two of their specifications.Footnote 15

Given the intertemporal utility function, solving the optimization problem yields the tangency condition

$$\frac{{z_{t} + \omega }}{{z_{t + \tau } + \omega }} = \left\{ {\begin{array}{*{20}c} {\left( {\beta \delta^{\tau } R} \right)^{{\frac{1}{\alpha - 1}}} ,} & {{\text{if }}t = 0} \\ {\left( {\delta^{\tau } R} \right)^{{\frac{1}{\alpha - 1}}} ,} & {{\text{if }}t > 0} \\ \end{array} } \right..$$

Taking logs gives a linear equation

$$\ln \left( {\frac{{z_{t} + \omega }}{{z_{t + \tau } + \omega }}} \right) = \left( {\frac{\ln \beta }{{\alpha - 1}}} \right) \cdot 1_{t = 0} + \left( {\frac{{\ln \delta^{\tau } }}{\alpha - 1}} \right) + \left( {\frac{1}{\alpha - 1}} \right) \cdot \ln R$$

where \(1_{t = 0}\) is the indicator for the Present Group.
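For readers who want the intermediate step, the tangency condition can be sketched for the Present Group (\(t = 0\)) as follows. The Lagrange multiplier \(\lambda\) is notation introduced here for illustration and does not appear elsewhere in the paper. Maximizing the utility function subject to \(Rz_{t} + z_{t + \tau } = m\) gives the first-order conditions

$$\left( {z_{t} + \omega } \right)^{\alpha - 1} = \lambda R,\quad \beta \delta^{\tau } \left( {z_{t + \tau } + \omega } \right)^{\alpha - 1} = \lambda .$$

Dividing the first condition by the second yields

$$\left( {\frac{{z_{t} + \omega }}{{z_{t + \tau } + \omega }}} \right)^{\alpha - 1} = \beta \delta^{\tau } R,$$

and raising both sides to the power \(\frac{1}{\alpha - 1}\) gives the first case of the tangency condition. For the Delayed Group (\(t > 0\)), the common factor \(\delta^{t}\) cancels in the same way and \(\beta\) is absent, as in the second case.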

The parameters to be estimated are the present bias parameter, \(\beta\), the discount factor, \(\delta\), and the power curvature parameter, \(\alpha\). The present bias parameter is identified by the differences in allocation between the Present Group and the Delayed Group. If there is a present bias, subjects in the Present Group will allocate more tokens to the sooner date than those in the Delayed Group. The discount factor is identified by one’s average choice across different experimental interest rates. A more patient subject will allocate more tokens to the later date in all decisions. The curvature parameter is identified by the dispersion of one’s choices across interest rates. Those who consider rewards highly substitutable over time are likely to make corner choices in all decisions, while those with lower elasticity of intertemporal substitution will make choices closer to equal splits.

Following the practice in previous studies (Andreoni & Sprenger, 2012, and Augenblick et al., 2015), we assume a normally distributed error term additive to the log-consumption ratio and consider censoring, to yield the two-limit Tobit model:

$$\begin{aligned} l_{i,j,k}^{*} & \equiv \ln \left( {\frac{{z_{t;i,j,k}^{*} + \omega }}{{z_{t + \tau ;i,j,k}^{*} + \omega }}} \right) \\ & = \left( {\frac{\ln \beta }{{\alpha - 1}}} \right) \cdot 1_{t = 0;i} + \left( {\frac{{\ln \delta^{\tau } }}{\alpha - 1}} \right) + \left( {\frac{1}{\alpha - 1}} \right)\ln R_{j} + \varepsilon_{i,j,k} , \varepsilon_{i,j,k} \;\sim \;N\left( {0,\sigma_{k} } \right) \\ \end{aligned}$$
$$l_{i,j,k} = \left\{ {\begin{array}{*{20}l} {\ln \frac{\omega }{{m_{k} + \omega }},} \hfill & {{\text{if}}\quad l_{i,j,k}^{*} \le \ln \frac{\omega }{{m_{k} + \omega }}} \hfill \\ {l_{i,j,k}^{*} ,} \hfill & {{\text{if}}\quad \ln \frac{\omega }{{m_{k} + \omega }} < l_{i,j,k}^{*} < \ln \frac{{\frac{{m_{k} }}{{R_{j} }} + \omega }}{\omega }} \hfill \\ {\ln \frac{{\frac{{m_{k} }}{{R_{j} }} + \omega }}{\omega },} \hfill & {{\text{if}}\quad l_{i,j,k}^{*} \ge \ln \frac{{\frac{{m_{k} }}{{R_{j} }} + \omega }}{\omega }} \hfill \\ \end{array} } \right.$$

where \(i = 1, \ldots ,203\) denotes Subject \(i\), \(j = 1, \ldots ,7\) denotes Interest rate \(j\), and \(k = 1, \ldots ,5\) denotes Total budget \(k\). The distribution of the error term is allowed to vary across total budgets, since a larger number of tokens might induce more noise, which would be a competing explanation for a greater sensitivity to the interest rate.
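The likelihood contribution of a single observation in this two-limit Tobit model can be sketched as follows. This is a minimal illustration rather than the estimation code used in the study: the function names are hypothetical, and \(\sigma_{k}\) is treated as a standard deviation, which is an assumption about the notation \(N(0,\sigma_{k})\).

```python
import math

def norm_cdf(x):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def censoring_limits(m, R, omega):
    """Lower and upper censoring points of the log ratio for budget m and gross rate R."""
    lower = math.log(omega / (m + omega))        # everything allocated to the later date
    upper = math.log((m / R + omega) / omega)    # everything allocated to the sooner date
    return lower, upper

def tobit_loglik_obs(l_obs, mu, sigma, lower, upper):
    """Log-likelihood of one observed log ratio l_obs.

    mu is the model-predicted latent mean; sigma is assumed to be the
    standard deviation of the error term (an assumption of this sketch).
    """
    if l_obs <= lower:   # censored from below: P(latent <= lower)
        p = norm_cdf((lower - mu) / sigma)
        return math.log(max(p, 1e-300))
    if l_obs >= upper:   # censored from above: P(latent >= upper)
        p = norm_cdf((mu - upper) / sigma)
        return math.log(max(p, 1e-300))
    # interior choice: normal density of the residual
    z = (l_obs - mu) / sigma
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi) - math.log(sigma)
```

Summing these contributions over subjects, interest rates, and budgets gives the objective that a quasi-maximum-likelihood routine would maximize.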

The Tobit model can predict corner choices with a natural interpretation. When the background consumption \(\omega\) is positive, the marginal utility at a zero reward is finite. If the implied interest rate is much higher than the discount rate, the model predicts a latent choice with a negative budget on the sooner date. The individual would be willing to give up part of her background wealth on the sooner date to earn a larger amount on the later date. But in the experiment, she is not allowed to do that. This is naturally captured by the Tobit model by censoring at the later corner. The opposite case occurs when the implied interest rate is much lower than the discount rate. Then the individual would be willing to borrow money from the experimenter if she could, but her choice is censored at the sooner corner.Footnote 16
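The censoring logic can be illustrated with a short sketch that computes the latent sooner-date reward from the tangency condition and then censors it at the two corners. The budget constraint \(R z_{t} + z_{t + \tau } = m\) is an assumption inferred from the censoring limits above, the function name is hypothetical, and \(\delta\) is taken to be expressed per unit of \(\tau\).

```python
def predicted_sooner_amount(beta, delta, alpha, R, tau, m, omega, present=True):
    """Return (latent, censored) sooner-date reward z_t.

    Assumes the budget constraint R * z_t + z_later = m (inferred from the
    censoring limits, not stated explicitly in this section).
    """
    discount = (beta if present else 1.0) * delta ** tau
    # Target ratio (z_t + omega) / (z_later + omega) from the first-order condition
    rho = (discount * R) ** (1.0 / (alpha - 1.0))
    # Solve z_t + omega = rho * (m - R * z_t + omega) for z_t
    latent = (rho * (m + omega) - omega) / (1.0 + rho * R)
    # The experiment does not allow negative allocations on either date
    censored = min(max(latent, 0.0), m / R)
    return latent, censored
```

With \(\alpha\) close to 1 and a high interest rate, the latent value is negative and the observed choice is the later corner; with a more concave utility function the same interest rate yields an interior choice.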

The model is estimated by the quasi-maximum-likelihood method: when performing the estimation, the error term, \(\varepsilon\), is assumed to be i.i.d., while in computing the standard errors, the error term is assumed to be independent across subjects, but might be correlated within-subject. Estimates of the parameters can be recovered and standard errors can be inferred by the delta method.
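As an illustration of the recovery step, the sketch below inverts the three regression coefficients of the log-linear equation into \(\beta\), \(\delta\), and \(\alpha\), and shows the delta-method standard error for \(\alpha\) only. The function and argument names are hypothetical, and the sketch assumes that \(\tau\) is measured in the same time unit as \(\delta\).

```python
import math

def recover_parameters(b_present, b_const, b_lnR, tau):
    """Map Tobit coefficients to structural parameters.

    b_present = ln(beta) / (alpha - 1)      coefficient on the Present Group indicator
    b_const   = tau * ln(delta) / (alpha - 1)   constant term
    b_lnR     = 1 / (alpha - 1)             coefficient on ln R
    """
    alpha = 1.0 + 1.0 / b_lnR
    delta = math.exp(b_const / (b_lnR * tau))
    beta = math.exp(b_present / b_lnR)
    return beta, delta, alpha

def delta_method_se_alpha(b_lnR, se_b_lnR):
    """Delta-method standard error of alpha = 1 + 1/b, using |d alpha / d b| = 1 / b**2."""
    return se_b_lnR / (b_lnR ** 2)
```

Standard errors for \(\beta\) and \(\delta\) would additionally require the covariances between the coefficients, since both are functions of two coefficients.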

Since we are interested in the magnitude effect, we also perform the estimation with interaction terms of the parameters and the budget dummies. Thus, tests can be performed on the differences between the parameters for different total budgets.

To see why the measure of the intertemporal substitutability is robust against misspecification, notice that \(\alpha\) is identified from the sensitivity of the log-ratio of the consumption (i.e., \(\ln \left( {\frac{{z_{t;i,j,k}^{*} + \omega }}{{z_{t + \tau ;i,j,k}^{*} + \omega }}} \right)\)) to the logarithm of the gross interest rate (i.e., \(\ln R_{j}\)). This is approximately the sensitivity of the percentage change in the ratio of consumption to the percentage change in the relative price, which is exactly the definition of the elasticity of intertemporal substitution. Therefore, even if the utility function is misspecified, \(\alpha\) is still a measure of intertemporal substitutability.

In Online Appendix D, we assume a more flexible specification, in which the utility function has one additional free parameter. There the background consumption, \(\omega\), is also a parameter to be estimated. In this way, we address the concern that the average self-reported background consumption may not match the true background consumption integrated with the experimental rewards in decision making, or the elasticity of intertemporal substitution of the utility function may not be constant (i.e., the power utility function with a fixed consumption shift is misspecified). The results are basically the same.

5.1.2 Results

Table 4 reports both the magnitude-invariant and the magnitude-specific estimates of the parameters. A salient feature is that none of the estimates of \(\beta\) is significantly different from 1, implying no evidence of present bias, which is consistent with our finding in the model-free analysis. The annual discount rate for all the budgets together is 52.7%, which is in the range found by previous studies. The curvature parameters are always significantly smaller than 1, implying that the subjects on average consider the monetary rewards received on different dates imperfectly substitutable, which is also consistent with other studies (e.g., Andreoni & Sprenger, 2012; Andreoni et al., 2015; Augenblick et al., 2015; Cheung, 2020).

Table 4 Discounting and curvature parameter estimates in the aggregate-level estimation

Most importantly, both the discount factor and the curvature parameter are increasing in the stake. To judge whether these magnitude effects are significant, Table 5 presents Wald tests on the differences in parameters between total budgets. We find significant magnitude effects both on the discount factor, \(\delta\), and on the exponent parameter, \(\alpha\), which is a positive transformation of the elasticity of intertemporal substitution. The discount factor is increasing in the total budget, meaning that the decision weights on later rewards shift upward when the total budget increases. The elasticity of intertemporal substitution is increasing in the total budget, meaning that the rewards on the two dates are more substitutable to the subjects when a larger total budget is provided. This results in choices closer to the two corners (which corner depends on whether \(\delta R > 1\)). Thereby, we verify Hypothesis 3 and Hypothesis 4.

Table 5 Estimates of parameter differences between total budgets in the aggregate-level estimation

To get an idea of the size of the magnitude effects, we compare the discount rates and the predicted monetary discount rates, respectively, between stakes. The continuous annual discount rates at stakes of €20, €40, €60, €80, and €160 are 0.696, 0.519, 0.384, 0.370, and 0.237, respectively. Thus, the discount rate is 1.3 to 1.6 times larger when the stake is halved. This is larger than the effects found by Andersen et al. (2013): their discount rate at a stake of 1,500 Danish kroner is only 1.0 to 1.1 times larger than that at a stake of 3,000 Danish kroner. As we mentioned earlier, the difference is consistent with the pattern that the magnitude effect becomes smaller at higher stakes. To incorporate the magnitude effect on the utility curvature, we predict the monetary discount rates at the lowest and the highest stakes. The monetary discount rate at the stake of €20, for example, is the continuous annual discount rate implied by an indifference relation between a later reward of €20 and an equally good sooner reward, assuming linear utility. We find that the monetary discount rate at €20 is 1.058 and that at €160 is 0.257, so the former is 4.1 times larger than the latter. Therefore, the two channels of the magnitude effect between the stakes of €20 and €160 are both large. Halevy (2015), for instance, finds that the monetary discount rates measured with a budget of $10 are 1.4 to 1.9 times larger than those with a budget of $100.
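The ratios quoted in this paragraph follow directly from the reported estimates. The short check below reproduces them; the variable names are illustrative, and the numbers are those stated in the text.

```python
# Continuous annual discount rates by stake (in euros), as reported in the text
rates = {20: 0.696, 40: 0.519, 60: 0.384, 80: 0.370, 160: 0.237}

# Ratio of the discount rate at a stake to the rate at twice that stake
halving_ratios = {
    (stake, 2 * stake): rates[stake] / rates[2 * stake]
    for stake in rates
    if 2 * stake in rates
}

# Predicted monetary discount rates at the lowest and highest stakes
monetary_ratio = 1.058 / 0.257

for pair, ratio in halving_ratios.items():
    print(pair, round(ratio, 1))
print("monetary", round(monetary_ratio, 1))
```

Three stake pairs differ by a factor of two (€20 and €40, €40 and €80, €80 and €160), and their ratios span the stated range of 1.3 to 1.6.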

To illustrate the relative importance of the two channels of the magnitude effect, we use the estimates above to predict choices in the 35 questions for both the Present Group and the Delayed Group. Table 6 presents the marginal effects of allowing one parameter to vary with the total budget: in each row, we allow only one parameter, either \(\delta\) or \(\alpha\), to vary with the total budget of the decisions (as indicated by the column title), but fix the other two parameters at the value estimated from the budget of €20. Each number in a cell is the total change (in units of \(\frac{{N}_{k}}{100}\), the percentage of the total budget) in the seven decisions with the corresponding total budget. The results show that the marginal effect of allowing \(\alpha\) to vary with the total budget is at least as large as the marginal effect of allowing \(\delta\) to vary. This suggests that the magnitude effect on the elasticity of intertemporal substitution is at least as important as the magnitude effect on the discount rate.

Table 6 Marginal effects of allowing a parameter to vary with total budgets in the aggregate-level estimation

5.2 Individual-level estimation

The aggregate-level estimation provides evidence of positive magnitude effects on the discount factor and intertemporal substitutability. One may wonder whether these results are purely a compositional effect: a bias resulting from forcing all subjects to have the same preferences and the same distribution of noise. To deal with this concern, we also perform individual-level estimation and tests.

5.2.1 Estimation and testing procedure

We keep all the assumptions that underlie Eq. (1) except for \(\beta\) since it is not identified in individual-level estimations. We estimate the discount factor (\(\delta\)) and the intertemporal substitutability (\(\alpha\)) for each combination of subject and stake, and then test if the two parameters are increasing in the magnitude within-subject.

One important difference from the aggregate-level estimation is that under-identification occurs when a subject made no or only one interior choice at one stake. There are 627 out of 1015 (62%) subject-stake combinations for which this is the case. We thereby adopt a conservative way to test the magnitude effect. First, we obtain point estimates of \(\delta\) and \(\alpha\) if possible by estimating the Tobit model specified in Sect. 5.1. Whenever there is an under-identification problem, we remove the error term from the model and then infer the intervals of \(\delta\) and \(\alpha\) that can generate the observations. Second, we perform a one-sided sign test on the two parameters, respectively, with the null hypothesis that they do not change with the magnitude. The sign test is flexible in that it does not impose assumptions on the distributions of idiosyncratic shocks to the parameters. For a comparison between a point estimate and an interval estimate, we recognize a difference only if the point is not in the interior of the interval. For a comparison between two interval estimates, we recognize a difference if the two intervals do not overlap.Footnote 17
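The comparison rules and the sign test can be sketched as follows. Estimates are represented as `(low, high)` pairs, with a point estimate having `low == high`. The function names are hypothetical, and two details are assumptions of this sketch rather than statements from the text: a point lying exactly on the boundary of an interval is treated as outside its interior, and unchanged comparisons are dropped from the sign test, as in the conventional version of that test.

```python
from math import comb

def compare(a, b):
    """Return +1 if b is recognised as larger than a, -1 if smaller, 0 if unchanged."""
    a_is_point = a[0] == a[1]
    b_is_point = b[0] == b[1]
    if a_is_point or b_is_point:
        # Point vs point, or point vs interval: a difference is recognised
        # only when the point is not in the interior of the interval.
        if a != b and a[1] <= b[0]:
            return 1
        if a != b and a[0] >= b[1]:
            return -1
        return 0
    # Interval vs interval: a difference is recognised only without overlap.
    if a[1] < b[0]:
        return 1
    if a[0] > b[1]:
        return -1
    return 0

def sign_test_p_value(n_increase, n_decrease):
    """One-sided sign test: P(X >= n_increase) for X ~ Binomial(n, 0.5)."""
    n = n_increase + n_decrease  # unchanged comparisons are dropped (assumption)
    if n == 0:
        return 1.0
    return sum(comb(n, k) for k in range(n_increase, n + 1)) / 2 ** n
```

Because overlapping or containing intervals count as unchanged, under-identified subject-stake combinations can only weaken, never strengthen, the evidence for a magnitude effect, which is what makes the procedure conservative.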

We also categorize the subjects based on how their preference parameters change with the stake. For each parameter, a subject presents values for five stakes, allowing for 10 pairwise comparisons across stakes. We use two criteria for our categorization. Under the first criterion, subjects who present no magnitude effect are those who have “unchanged” results in all 10 comparisons across stakes; subjects who present positive magnitude effects are those who have no “decrease” and at least one “increase”; subjects who present negative magnitude effects are those who have no “increase” and at least one “decrease”; and the remaining subjects are said to present a mixed magnitude effect. Arguably this criterion is too stringent in that it does not allow for any error. Our second criterion allows for error: we define subjects who present positive magnitude effects as those who have more “increases” than “decreases” and subjects who present negative magnitude effects as those who have more “decreases” than “increases”.
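The two criteria can be written as a small classification function. In this sketch the input is the list of 10 pairwise comparison results for one subject and one parameter, coded as +1 (increase), -1 (decrease), or 0 (unchanged). The function name and category labels are illustrative, and the treatment of ties under the second criterion is an assumption, since the text does not specify it.

```python
def categorize(signs, strict=True):
    """Classify a subject from pairwise comparison results across stakes."""
    increases = sum(1 for s in signs if s > 0)
    decreases = sum(1 for s in signs if s < 0)
    if strict:
        # First criterion: no error allowed
        if increases == 0 and decreases == 0:
            return "none"
        if decreases == 0:
            return "positive"
        if increases == 0:
            return "negative"
        return "mixed"
    # Second criterion: majority of the recognised changes
    if increases > decreases:
        return "positive"
    if decreases > increases:
        return "negative"
    # Ties are not covered by the text; this fallback is an assumption
    return "none" if increases == 0 else "mixed"
```

For example, a subject with two increases, one decrease, and seven unchanged comparisons is "mixed" under the first criterion but "positive" under the second.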

5.2.2 Results

Table 7 shows the results of the tests at the individual level. We reject the null hypotheses of no magnitude effect on the two parameters, in favor of positive magnitude effects. This shows that the two channels of the magnitude effect on intertemporal choices are robust against individual heterogeneity.

Table 7 Sign tests on preference parameters between total budgets

Table 8 shows the results of subject categorization. Under both criteria, subjects who present positive magnitude effects are much more frequent than those who present negative magnitude effects. This is consistent with our main finding that positive magnitude effects exist.Footnote 18

Table 8 Categorization of subjects

6 Interpretations

The results above imply that when an average subject faces a larger budget in an intertemporal allocation task, she behaves more patiently and also regards rewards as more substitutable between dates.

6.1 Relation with the magnitude effect on risk aversion

According to the Discounted Expected Utility (DEU) theory, when the atemporal utility function is a power function, the risk attitude and the elasticity of intertemporal substitution are represented by the same parameter, since risk aversion and imperfect fungibility both originate from diminishing marginal utility. Therefore, one may wonder whether the magnitude effect on intertemporal substitutability is the same as the magnitude effect on risk attitudes.

We find evidence against this equivalence when we compare our results with previous findings. Studies using risky prospects find that the Arrow–Pratt measure of relative risk aversion (Pratt, 1964) is larger when outcomes are scaled up (Binswanger, 1981; Kachelmeier & Shehata, 1992; Holt & Laury, 2002; Harrison et al., 2005; Fehr-Duda et al., 2010; Bouchouicha & Vieider, 2017). This is in the direction that is opposite to what we find. Increasing relative risk aversion suggests an increase in the concavity of a power utility function as the stake increases while our results show a movement towards linearity. This contradiction suggests that the magnitude effect on relative risk aversion is not driving the magnitude effect on intertemporal substitutability.

Some other studies also suggest that risk aversion and intertemporal substitutability should be separated. Andreoni and Sprenger (2012) found no significant correlation at the individual level between the curvature estimated by the CTB method and the risk attitude elicited by the MPL method. Abdellaoui et al. (2013b), Miao and Zhong (2015), and Cheung (2015) also found that the utility curvature elicited from intertemporal tasks is quantitatively different from that elicited from tasks with risk. We provide evidence from a different perspective: while the previous studies showed that the degrees of concavity are different for the two kinds of utility functions, we show that the degrees of concavity change in opposite directions when the stake is varied.

This finding has implications for both theories and experimental methods. First, it lends support to the theories which separate intertemporal substitutability from risk aversion (Kihlstrom & Mirman, 1974; Richard, 1975; Kreps & Porteus, 1978; Epstein & Zin, 1989; Weil, 1990; Bommier, 2007; Ebert & van de Kuilen, 2016, etc.). Second, it casts doubt on the use of a risk-elicitation task to correct for the curvature when eliciting time preferences.

6.2 Relation with borrowing constraints

In theory, a binding borrowing constraint can lead to a magnitude effect on the monetary discount rate in a single-reward task if the background consumption is expected to grow over time, as shown by Epper (2015). However, Meier and Sprenger (2010) found that experimentally elicited long-run discount rates are uncorrelated with credit constraints, suggesting that on average, whether the borrowing constraint is binding does not affect intertemporal choices in experiments where all outcomes are in the future.

Moreover, given that subjects may have savings that provide some, albeit limited, liquidity, the number of subjects whose borrowing constraints are binding should be increasing in the stake. For this reason, if the borrowing constraint were the main issue, we should observe that the intertemporal substitutability is decreasing in the stake, which is inconsistent with our results. Therefore, we believe that a binding borrowing constraint is not the main driver of our results.

6.3 Relation with existing theories

We discuss the implications of our empirical findings for some theories that may explain the two channels of the magnitude effect.

One model that can account for the magnitude effect on the discount factor was proposed by Benhabib et al. (2010). They developed a model with a fixed cost of delaying rewards. The idea is that whenever a delayed reward is chosen, a fixed cost is incurred so that as the stake increases, the cost becomes relatively less important and hence the subject appears more patient.

Noor (2011) proposed a model of magnitude-dependent discounting, which leads to similar predictions. In his model, the discount function is increasing in the utility at the later period. As the stake gets larger, the discount function converges to 1.Footnote 19

One theory that can explain the magnitude effect on intertemporal substitutability is an extended version of the dual-self bank-nightclub model of Fudenberg and Levine (2006). In the original model, the agent first chooses the amount of pocket cash when no temptation is present, and then she chooses the amount of consumption when a windfall is available and temptation plays a role. The strategy for utility maximization is to spend the complete windfall when it is small but try to save some money out of the windfall when it is large. A small windfall is not integrated with the lifelong wealth, because the agent does not bother to perform self-control, but it is worth controlling oneself when the windfall is large. As a result, the utility function for windfalls is much more concave when the size of the windfall is below a certain threshold than when it is above the threshold.

The model can explain a magnitude effect on intertemporal substitutability if we impose the assumption that an agent who anticipates a reward in the future does not immediately adjust her cash allocation plan. Instead, she keeps the anticipated reward in the mental account of windfalls until it is received and part of it is consumed. Only after the remainder is moved into the mental account of savings does she reschedule her future consumption.

When this assumption is used, what the model predicts in intertemporal allocation tasks is consistent with our results. A subject is likely to make interior choices when the budget is small, i.e., below the threshold induced by the self-control costs. Since the utility function for windfalls is very concave, the subject balances extra consumption on the sooner date and the later date. As the budget increases above the threshold, the subject likes to save part of it for consumption smoothing. Since the utility function for savings is much less concave (close to linear), these savings will be allocated fully to either the sooner date (when the interest rate is small) or the later date (when the interest rate is large). Hence, as the budget increases the intertemporal substitutability increases and it will appear as if the utility function has become less concave (see Online Appendix E for a simulation).

Another model that can explain the magnitude effect on intertemporal substitutability is the mental zooming theory proposed by Holden and Quiggin (2017). The theory presumes that people integrate more background consumption with the experimental reward as the size of the reward increases. If the budget increases, individuals ‘zoom out’ as it were and take a broader perspective on the decision problem. One reason may be that individuals are likely to divide and use up a bigger windfall over a longer period. Based on the data collected from their field experiment with Malawian peasants, Holden and Quiggin showed that the magnitude effect on the discount rate in single-reward tasks would disappear if the unobserved background consumption is assumed to be an increasing function of the stake.

In intertemporal allocation tasks, the increasing background consumption can generate a magnitude effect on intertemporal substitutability. To see why, denote the observed elasticity of intertemporal substitution in experimental rewards by \(e_{z}\). The relationship between \(e_{z}\) and preference parameters is

$$e_{z} = \frac{1}{1 - \alpha } \cdot \frac{{\log \left( {\frac{{z_{t + \tau } }}{{z_{t} }}} \right)}}{{\log \left( {\frac{{z_{t + \tau } + \omega }}{{z_{t} + \omega }}} \right)}}.$$

Since \(e_{z}\) is increasing in both \(\alpha\) and \(\omega\), an increase in \(\alpha\) and an increase in \(\omega\) are competing explanations for the magnitude effect on intertemporal substitutability. If subjects take into account more background consumption as the total budget increases, we would observe a greater sensitivity to the interest rate, i.e., a greater \(e_{z}\). When we assume a fixed background consumption, however, the pattern will be attributed to a magnitude effect on \(\alpha\).
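A small numerical sketch makes the confound concrete. It evaluates the expression for \(e_{z}\) at a fixed allocation while varying either \(\omega\) or \(\alpha\). The function name and all numerical values are hypothetical and chosen only for illustration; both rewards must be positive and unequal for the expression to be defined.

```python
import math

def observed_eis(alpha, z_sooner, z_later, omega):
    """Observed elasticity of intertemporal substitution in experimental rewards."""
    reward_ratio = math.log(z_later / z_sooner)
    consumption_ratio = math.log((z_later + omega) / (z_sooner + omega))
    return (1.0 / (1.0 - alpha)) * reward_ratio / consumption_ratio

# Hypothetical allocation: 8 on the sooner date, 12 on the later date
base = observed_eis(alpha=0.5, z_sooner=8.0, z_later=12.0, omega=10.0)
more_background = observed_eis(alpha=0.5, z_sooner=8.0, z_later=12.0, omega=20.0)
less_curvature = observed_eis(alpha=0.8, z_sooner=8.0, z_later=12.0, omega=10.0)

print(base, more_background, less_curvature)
```

Both a larger \(\omega\) and a larger \(\alpha\) raise \(e_{z}\) relative to the baseline, so choice data alone cannot separate the two unless one of them is fixed.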

Both the mental-accounting Fudenberg-Levine model and the mental zooming theory point to partial integration with lifelong wealth, which seems to be an important mechanism of the magnitude effect on intertemporal substitutability. Andersen et al. (2018) showed empirically that subjects only partially integrate experimental rewards with wealth in risk preference tasks. While they provide evidence of partial asset integration by exploiting variation in personal wealth, we provide within-subject evidence suggesting that the degree of asset integration is increasing in the stake.

None of the current models can explain both a magnitude effect on the discount factor and a magnitude effect on the intertemporal substitutability. Of course, the two channels can be explained by a mode-switching model in which individuals are assumed to have different preferences for different stakes. However, a truly unified explanation is still lacking.

7 Conclusion

Our study investigates the magnitude effect on intertemporal choices in a setting that is more general than a single-reward task, namely the intertemporal allocation task. After adapting the definition of the magnitude effect to the new task, we verify its existence: people allocate a larger share of the budget to the later reward as the total budget increases. The magnitude effect is not affected by whether the sooner reward is immediate or in the future. The size of the magnitude effect is smaller when the stakes are higher.

We then look deeper into the effect by exploring its channels. The results underscore the importance of a dimension that is often overlooked, namely, the intertemporal substitutability. We find evidence that both the discount factor and intertemporal substitutability change with the magnitude of rewards. The magnitude effect on intertemporal substitutability is not driven by the magnitude effect on risk attitudes, since the two magnitude effects are in opposite directions.

Some theories may provide explanations for one of the two channels. A cost-of-delay model (Benhabib et al., 2010) or a magnitude-dependent discounting model (Noor, 2011; Baucells & Heukamp, 2012) can account for a magnitude effect on the discount factor. Models which allow the degree of asset integration (mental accounting) to vary with the size of the budget can explain a magnitude effect on intertemporal substitutability. However, a new theory would be needed to account for both channels simultaneously and in a unified way.