The delay-reward heuristic: What do people expect in intertemporal choice tasks?

Recent research has shown that risk and reward are positively correlated in many environments, and that people have internalized this association as a “risk-reward heuristic”: when making choices based on incomplete information, people infer probabilities from payoffs and vice-versa, and these inferences shape their decisions. We extend this work by examining people’s expectations about another fundamental trade-off—that between monetary reward and delay. In 2 experiments (total N = 670), we adapted a paradigm previously used to demonstrate the risk-reward heuristic. We presented participants with intertemporal choice tasks in which either the delayed reward or the length of the delay was obscured. Participants inferred larger rewards for longer stated delays, and longer delays for larger stated rewards; these inferences also predicted people’s willingness to take the delayed option. In exploratory analyses, we found that older participants inferred longer delays and smaller rewards than did younger ones. All of these results replicated in 2 large-scale pre-registered studies with participants from a different population (total N = 2138). Our results suggest that people expect intertemporal choice tasks to offer a trade-off between delay and reward, and differ in their expectations about this trade-off. This “delay-reward heuristic” offers a new perspective on existing models of intertemporal choice and provides new insights into unexplained and systematic individual differences in the willingness to delay gratification.


Introduction
Would you rather receive $10 now or $15 in 3 months' time? Thousands of studies have used this kind of question to investigate the processes by which humans choose between outcomes that differ in when they will occur. The central finding is that, for positive outcomes, deferment renders a positive outcome less attractive; that is, delays lead to discounting of the reward. To avoid dynamic inconsistency (changes in preference between the sooner and later options as they both draw nearer to the present moment), discounting should follow an exponential function: \(V = A e^{-kt}\), where \(V\) is the subjective value now of an amount \(A\) that occurs at time \(t\) from now, and \(k\) is a parameter that determines the steepness of the discounting (Samuelson, 1937). However, people's preferences typically deviate from this prescription by showing steeper discounting over short delays than over long ones (e.g., Green & Myerson, 1996); this pattern is often labelled "hyperbolic", but it has been modelled with a wide range of functions, each capturing distinct psychological insights (see Doyle, 2013, for a review).
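As a minimal sketch of why the exponential form avoids dynamic inconsistency, the following Python fragment compares it with a simple one-parameter hyperbola, V = A / (1 + kt). The hyperbola is only one of the many candidate functions in the literature, and the discount rates, the 12-month front-end delay, and the function names are illustrative assumptions rather than values taken from any study.

```python
import math


def exponential_value(amount, delay, k):
    """Exponential discounting: V = A * exp(-k * t)."""
    return amount * math.exp(-k * delay)


def hyperbolic_value(amount, delay, k):
    """Simple hyperbola (assumed form): V = A / (1 + k * t)."""
    return amount / (1.0 + k * delay)


def prefers_later(value_fn, k, front_delay):
    """Return True if $15 is valued above $10 when it arrives 3 months later.

    front_delay is the number of months until the sooner option is available.
    """
    sooner = value_fn(10.0, front_delay, k)
    later = value_fn(15.0, front_delay + 3.0, k)
    return later > sooner


# Hypothetical monthly discount rates, chosen only for illustration.
K_EXPONENTIAL = 0.2
K_HYPERBOLIC = 0.5

# Evaluate the same pair of options now (0) and pushed 12 months into the future.
exp_choices = [prefers_later(exponential_value, K_EXPONENTIAL, d) for d in (0.0, 12.0)]
hyp_choices = [prefers_later(hyperbolic_value, K_HYPERBOLIC, d) for d in (0.0, 12.0)]

print("exponential prefers later:", exp_choices)
print("hyperbolic prefers later:", hyp_choices)
```

With these assumed rates, the exponential discounter prefers the sooner option at both horizons, whereas the hyperbolic discounter switches to the later option once both outcomes are pushed 12 months into the future.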
The present experiments contribute to our understanding of intertemporal choice by examining people's expectations about the options they will encounter in intertemporal choice tasks. For example, when asked: "Would you rather receive $10 now or $15 in. . .", what expectation do decision-makers have about the to-be-revealed delay? And how does that expectation shape their subsequent evaluation of the options?
These questions are motivated by a long line of research showing that people internalize and exploit ecological regularities to make inferences when they have incomplete information (e.g., Brunswik, 1943; Gigerenzer et al., 1999; Skylark, 2018), and that recent experience with attribute values and trade-offs shapes the attractiveness of a given option (e.g., Stewart et al., 2003; Tversky & Simonson, 1993; Rigoli & Dolan, 2019). Most pertinent to the current work is a series of studies by Pleskac and colleagues investigating people's expectations about the trade-off between risk and reward.
In an initial series of studies, Pleskac and Hertwig (2014) found that, across a wide range of real-world domains, larger rewards are associated with smaller probabilities, such that gambles gravitate towards being "fair bets". [Footnote: There are various reasons for this; perhaps the simplest is the operation of competition. For example, a lottery that offers unattractive odds will lose punters to one that offers a better chance of winning, but a lottery that offers an expected return greater than the price of the ticket will go bust.] Pleskac and Hertwig suggest that people have internalized this regularity and use it when confronted with incomplete information about a risky option (i.e., that people employ a "risk-reward heuristic"). They contrast this hypothesis with traditional economic theory, in which probability, money, and time are all independent factors (e.g., Savage, 1954), and with the predictions of a "desirability bias" wherein people optimistically rate desirable outcomes as more probable than undesirable ones (e.g., Krizan & Windschitl, 2007). Pleskac and Hertwig (2014) tested their hypothesis by presenting a lottery that costs $2 to play and which offers the opportunity to win $2.50, $4, $10, $50, or $100 (with the value varied between participants). As predicted, participants' estimates of the probability of winning were negatively associated with the value of the prizes. Having made their estimates, participants were asked whether they would play the lottery: willingness to play was positively associated with both the stated prize and the inferred probability. Skylark and Prabhu-Naik (2018) subsequently found that people likewise infer missing reward values from stated probabilities; again, the stated and inferred values both predicted subsequent choice. More recent work by Leuker et al. (2018; 2019a,b) has directly manipulated the risk-reward trade-off in a learning environment and found that this affects subsequent inferences about, and choices between, risky options for which probability information is not provided.
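To illustrate what a strict fair-bet expectation would imply for the $2 lottery described in this paragraph, the sketch below computes, for each prize, the win probability at which the expected return equals the stake. It is an illustrative calculation only: it does not reproduce participants' estimates, and the function and variable names are hypothetical.

```python
def fair_bet_probability(stake, prize):
    """Win probability at which the expected return (p * prize) equals the stake."""
    return stake / prize


STAKE = 2.0  # cost of playing the lottery, in dollars
PRIZES = [2.50, 4.0, 10.0, 50.0, 100.0]  # prize values used across conditions

# Probability implied by a fair bet for each prize.
fair_probs = {prize: fair_bet_probability(STAKE, prize) for prize in PRIZES}

for prize, p in fair_probs.items():
    print(f"prize ${prize:.2f}: fair-bet probability {p:.2f}")
```

Under this assumption the implied probability falls from 0.80 for the $2.50 prize to 0.02 for the $100 prize, which is the negative payoff-probability association in its strongest form.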
We ask whether similar principles operate in the domain of intertemporal choice. A positive correlation between time and reward is found in many contexts. One simple justification is given by Stewart et al. (2006) and also follows from Pleskac and Hertwig's (2014) studies of the ecological risk-reward relationship: large gains are much rarer than small ones, so the time between large gains must on average be longer. However, it is not clear that people will have internalized this association and, if they have, it is not clear that it will take the form of a single, simple delay-reward function, because in everyday life people will experience a variety of such functions. For example, in games of chance like roulette the expected waiting time for a given outcome is a linear function of the payoff for that outcome, whereas bank accounts often offer returns that grow exponentially via compound interest, although the interest rate may be higher for longer-term investments (to compensate for increased risk) or may vary unpredictably. [Footnote: As one illustration, a one-year investment earning the Bank of England's base rate that started in February 2008 would have been earning 5.25% interest p.a. in the first month but only 1.5% p.a. for the last month.] And investments based on stock prices, or other instruments, can show even more complex relationships between time and returns. Thus, we predict that a decision-maker who brings their past experience to bear in a psychological study of intertemporal choice will expect a positive association between delay and reward, but given the variety of forms that they are likely to have encountered we remain agnostic about the precise form of this expectation.
[Footnote on the risk-reward heuristic: A strong form of this heuristic entails the inference that gambles have expected returns equal to the stake; a more general version entails that people infer larger rewards to have lower probability, and vice-versa.]
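The contrast between the roulette and compound-interest examples can be sketched numerically. The fragment below assumes a European wheel (37 pockets, with a bet on n numbers paying 36/n - 1 to 1) and a fixed, hypothetical 5% annual interest rate; both are assumptions made for illustration, as are the function names.

```python
import math


def roulette_wait(payout_odds, pockets=37):
    """Expected number of spins until a win for a bet paying payout_odds to 1.

    Assumes a European wheel on which a bet covering n numbers pays (36 / n - 1) to 1.
    """
    numbers_covered = 36.0 / (payout_odds + 1.0)
    win_probability = numbers_covered / pockets
    return 1.0 / win_probability  # mean of a geometric waiting time


def compound_reward(principal, annual_rate, years):
    """Balance after compounding annually at a fixed rate."""
    return principal * (1.0 + annual_rate) ** years


def years_to_reach(principal, target, annual_rate):
    """Delay needed for the balance to grow from principal to target."""
    return math.log(target / principal) / math.log(1.0 + annual_rate)


# Waiting time grows linearly with the payoff odds on the wheel.
waits = {odds: roulette_wait(odds) for odds in (1, 2, 5, 11, 35)}

# Reward grows exponentially with delay under compound interest (hypothetical 5% rate).
balances = [compound_reward(10.0, 0.05, years) for years in (1, 2, 3)]

print(waits)
print(balances)
```

Under these assumptions the waiting time is proportional to the payoff odds plus one, whereas the delay needed to reach a given balance grows only logarithmically with that balance, so the two environments imply differently shaped delay-reward functions.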
The existence of such a "delay-reward heuristic" would be important for several reasons. First, it would support the general claim that humans internalize and exploit ecological structure when making decisions from incomplete information. Second, the beliefs that people hold about the probable delay-reward trade-off may cast new light on efforts to identify the best mathematical description of intertemporal choice data (e.g., McKerchar et al., 2009), because such data would reflect variable past experiences rather than an immutable discounting function. Third, the existence of a delay-reward heuristic could provide new insights into individual differences in delay discounting (e.g., Reimers et al., 2009): variation in environmental context will produce heterogeneous preferences, and if particular demographic or dispositional variables are associated with particular contexts then this could partly or wholly explain the relationship between these individual-difference variables and people's intertemporal choice behaviour.
In four experiments, we adopt the approach of Pleskac and Hertwig (2014) and Skylark and Prabhu-Naik (2018) to investigate how changes in the stated delay affect inferences about the value of a monetary reward and vice-versa. We focus on the kinds of choice between immediate and delayed monetary outcomes with which we began this paper, and frame the task as being "a psychology experiment" because we believe that this is the most common context in which people encounter this kind of simplified decision.

Methods
The four experiments were similar, so their Methods are described together. The study materials are available from https://osf.io/e2fnt/

Participants
Studies 1A and 1B recruited US-based participants via Amazon's Mechanical Turk (www.mturk.com); Studies 2A and 2B recruited UK-based participants via Prolific (www.prolific.co). In all studies, eligible participants were those who provided complete data, who were aged 18+, who did not self-report past participation, and whose IP address had not previously occurred in the data file of the current study or earlier in the study series (including in related studies not reported here). Full details of the screening/eligibility requirements are given in Appendix 1. The final samples are described in Table 1.
Note (Table 1): The "Attentive" row indicates the size and proportion of the sample that passed both attention-check questions; "Novice" indicates the size and proportion of the sample who indicated that they had not previously taken part in a psychology study involving intertemporal choice between monetary outcomes.

Design and Procedure
Studies 1A and 1B were exploratory; Studies 2A and 2B were pre-registered confirmatory studies (https://aspredicted.org/mn3vn.pdf). All studies were conducted online. After an initial landing page, information sheet, and consent form, participants were told that they would be asked to consider a simple financial decision. They were told that although the scenario was hypothetical, they should answer as honestly and accurately as they could and that there were no right or wrong answers.

Study 1A
In Study 1A, participants were randomly assigned to 1 of 6 conditions, which differed in the time until the delayed reward. In the "1 day" condition, participants were told:

    Suppose that you take part in a Psychology experiment in which the experimenter offers you a choice between two financial options. Due to a random printing error, part of one of the options is missing, so you can't see the value that is meant to be displayed. The choice that the experimenter presents you with is shown below, with the missing value replaced by an "X".

    Which would you choose?
    Option A: Receive $10 now
    Option B: Receive $X in 1 day's time

    What do you think the missing value, X, is? Enter a number in the box below.

There followed a text box in which participants entered their judgment (only numeric responses were permitted). Note that they did not make a choice at this point. On the page after the estimation question, participants were told: "Suppose that your estimate of the missing value is correct. That is, suppose that the experimenter is offering you a choice between:" (the two options followed, with the participant's estimate in place of X). Participants indicated their choice by selecting between 2 radio buttons labelled "Option A" and "Option B". The 1 week, 2 weeks, 1 month, 6 months, and 1 year conditions were identical to the 1 day condition, except that the corresponding time period was used when stating the delayed option; each participant completed a single condition (i.e., made one estimate followed by one choice).
Finally, participants were asked whether they had previously started or completed the survey, and for their age and gender (male, female, or prefer not to say). After answering these questions, participants proceeded to a debriefing sheet. All questions required a response before the participant could progress to the next page.

Study 1B
Study 1B was identical to Study 1A, but the estimation task specified the delayed amount as either $13, $18, $23, $28, $33, or $38, and the missing value "X" referred to the delay associated with that outcome (e.g., the choice was: "Option A: Receive $10 now. Option B: Receive $13 in X"). Participants indicated their estimate of X (which they were told could include a decimal point) via a text box and selected one of 4 radio buttons to indicate the temporal units ("Day(s)", "Week(s)", "Month(s)", or "Year(s)"). As in Study 1A, they then proceeded to a screen that asked them to suppose that their estimate was correct and asked them to choose between the smaller-sooner and larger-later rewards, with their estimate of the delay used for the delayed option.

Studies 2A and 2B
Studies 2A and 2B were identical to Studies 1A and 1B, respectively, except as follows. All monetary values were in Sterling (£) and delays were expressed in days: in Study 2A, participants were randomly assigned to one of 8 delays (1, 4, 12, 27, 63, 88, 122, or 243 days) and estimated the reward; in Study 2B, participants were assigned to one of 8 future rewards (£11, 13, 15, 18, 23, 29, 38, or 54) and estimated the delay in days. Both studies included 2 multiple-choice attention-check questions after participants had made their choice; one question asked the value of the immediate reward (correct answer: £10); the other question asked why one value was originally replaced by an "X" (correct answer: because of a printing error). Participants who answered both questions correctly were labelled "Attentive". Finally, we added a question that probed the participant's past encounters with intertemporal choice studies ["Have you ever previously taken part in a Psychology study in which you were asked to choose between two amounts of money that differ in when they would be given to you (i.e., like this study)?"; response options: Yes/No/Don't know]. Participants who answered "No" were labelled "Novices". Studies 2A and 2B were run in parallel; participants were randomly assigned to one or the other.

Data treatment
We pre-registered that we would exclude any negative estimates, but there were none. All time values were converted to "days" (assuming 365 days per year and 365/12 days per month; in Study 1A, the delay for the "6 month" condition was treated as 365/2 days). For regression analyses, stated delay and stated reward were each divided by 10 (so the coefficients represent the effect of a 10-day or 10-dollar change), and age was divided by 10 and mean-centred (so the coefficients represent the effect of being 10 years older than average). Gender was coded -0.5, 0.5, and 0 for females, males, and "prefer not to say", respectively.
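The conversions described in this paragraph can be sketched as follows. The function names and the example values are hypothetical, and the 7-day week is an assumption, since the text specifies only the year and month conversions.

```python
# Days per response unit; the year and month values follow the text,
# the 7-day week is assumed.
DAYS_PER_UNIT = {
    "Day(s)": 1.0,
    "Week(s)": 7.0,
    "Month(s)": 365.0 / 12.0,
    "Year(s)": 365.0,
}

# Gender coding used in the regressions.
GENDER_CODES = {"female": -0.5, "male": 0.5, "prefer not to say": 0.0}


def to_days(value, unit):
    """Convert a delay expressed in the given unit to days."""
    return value * DAYS_PER_UNIT[unit]


def scale_by_ten(value):
    """Rescale a stated delay or reward so a coefficient reflects a 10-unit change."""
    return value / 10.0


def centre_ages(ages):
    """Express ages in decades and subtract the sample mean."""
    decades = [age / 10.0 for age in ages]
    mean = sum(decades) / len(decades)
    return [d - mean for d in decades]


# Hypothetical example values.
print(to_days(6, "Month(s)"))      # six months expressed in days
print(scale_by_ten(243))           # a 243-day stated delay, rescaled
print(centre_ages([20, 30, 40]))   # ages as centred decades
```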

Results
For Studies 1A and 1B, we began with simple tests of whether estimated rewards and delays depend on stated delays and rewards, and of whether estimates and stated values both influence choices. We then conducted more comprehensive exploratory analyses and robustness checks, which we subsequently pre-registered as the analysis plan for Studies 2A and 2B.
For Studies 1A and 1B, p-values less than .05 were treated as potentially important and we computed 95% confidence intervals; for the pre-registered Studies 2A and 2B, alpha was set to .01 and we report 99% confidence intervals.
A small proportion of participants inferred future rewards that were smaller than the immediately-available amount. Because participants were told there were no right or wrong answers, and because some may have believed they would be asked about a negative trade-off, we report the results with all estimates included in the analyses.

Studies 1A and 1B
In all studies, estimates (inferred rewards and delays) were highly positively skewed but approximately normal after transformation by \(\log_{10}(x + 1)\). The normality was only approximate: participants typically selected from a small set of possible values [for example, 6 months (182.5 days) when estimating the delay in Study 1B]. Figure 1 shows, in ascending order, the unique responses in each study; the y-axis shows the proportion of participants in each study who made that response, with data from each condition indicated by different shades of grey. This strong tendency to round numeric values is common in estimation tasks (e.g., Laming, 1997; Matthews & Stewart, 2009; Matthews et al., 2016); we discuss its implications below.
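A minimal sketch of the two operations described here, the log transformation and the tabulation of unique responses, is given below. The sample of delay estimates is invented for illustration and the function names are hypothetical.

```python
import math
from collections import Counter


def log_transform(estimate):
    """Transform an estimate as log10(estimate + 1), so that zero maps to zero."""
    return math.log10(estimate + 1)


def response_proportions(estimates):
    """Proportion of responses at each unique value, in ascending order of value."""
    counts = Counter(estimates)
    total = len(estimates)
    return {value: counts[value] / total for value in sorted(counts)}


# Hypothetical delay estimates in days; real responses cluster on round values.
sample = [30, 30, 182.5, 7, 30, 365, 182.5, 30]

props = response_proportions(sample)
transformed = [log_transform(x) for x in sample]

print(props)
```

In this invented sample half of the responses fall on a single value (30 days), which is the kind of clustering that makes the transformed distribution only approximately normal.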

Basic analyses
Table 2 shows descriptive statistics for each study. Figure 2 plots estimates against conditions, with solid points indicating the means. (For this plot, we log-transformed the estimates after adding 1 to deal with zeroes. To improve clarity, we transformed the y-axis tick marks as \(10^{y}\); thus, the y-axes are labelled "Estimate + 1".) The right-hand panels show the same data with a logarithmic x-axis, which helps to clarify the effects for low values of the stated delay/reward. The presence of some extreme responses compresses the bulk of the data, so Figure 3 re-plots the means with confidence intervals, making the pattern clearer. The impression from the figures is that inferred delays increased with stated rewards, and vice-versa. Kruskal-Wallis tests confirmed that the stated value of the delay affected estimates of the reward in Study 1A, \(\chi^2(5) = 75.08\), \(p < .001\), and that stated rewards affected estimates of the associated delay in Study 1B, \(\chi^2(5) = 28.67\), \(p < .001\). One-way ANOVAs on the log-transformed estimates led to the same conclusions [Study 1A: \(F(5, 327) = 15.56\), \(p < .001\), \(\eta^2 = .192\); Study 1B: \(F(5, 331) = 5.21\), \(p < .001\), \(\eta^2 = .073\)].
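For readers who want to see what the two omnibus tests compute, the following is a plain-Python sketch of the Kruskal-Wallis H statistic (with the usual tie correction) and of the one-way ANOVA F ratio with eta squared as the effect size. It is not the authors' analysis code, the data are invented, and p-values are omitted because they would require a chi-square or F distribution function.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1 upwards, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of groups."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    return h / (1.0 - tie_term / (n ** 3 - n))


def one_way_anova(groups):
    """Return (F, df_between, df_within, eta_squared) for a list of groups."""
    pooled = [x for g in groups for x in g]
    grand_mean = sum(pooled) / len(pooled)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(pooled) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within, ss_between / (ss_between + ss_within)


# Hypothetical reward estimates for three stated delays.
raw = [[12, 10, 15, 11], [15, 20, 12, 25], [30, 20, 50, 40]]
logged = [[math.log10(x + 1) for x in g] for g in raw]

print(kruskal_wallis_h(raw))
print(one_way_anova(logged))
```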
Figure 4 shows the proportion of participants choosing to take the delayed option in each condition. If participants' inferred delays or rewards corresponded to their indifference points (i.e., if they thought they were going to be offered options for which the immediate and delayed options were equally attractive), the data points would fall on a flat line at 0.5. Clearly, this is not the case: in all studies, there is an overall tendency to choose the delayed option. And at the group level, the data in Figure 4 suggest that larger stated delays led to less willingness to wait, implying that the increase in inferred reward was not sufficient to offset the increase in waiting time. Similarly, increases in stated reward led to increased willingness to wait, implying that the increase in inferred delay was not sufficient to offset the increasing appeal of the potential gain. These effects are further evidenced by Table 3, which shows the results of logistic regression of choice (immediate reward = 0; delayed reward = 1) on condition (stated delays and rewards, treated as continuous variables) and log-transformed estimates (inferred rewards and delays). In both studies, there was a positive association between willingness to wait and the stated or inferred reward, and a negative association between willingness to wait and stated or inferred delay. The effects of estimate remain after controlling for condition: for a given stated delay, people who inferred a larger future reward were more likely to choose to wait; and for a given stated reward, those who inferred a larger delay were less inclined to wait.

The effects of age and gender on inferences
We next conducted exploratory analyses to investigate whether age and gender predict inferences and choices. First, we computed the Kendall correlation matrices shown in Table 4. In Study 1A, estimated rewards were smaller for men than for women and in Study 1B estimated delays were larger for older participants. These associations are complicated by the fact that, in both studies, age and gender are confounded.
To clarify the picture, we regressed log-transformed estimates on age, gender, and condition. We ran several versions of the analysis to help ensure that our results were not a consequence of particular analytic decisions. In one version, we treated condition as a continuous predictor, as above. However, because the relationship may not be linear, we ran a separate version with condition as a categorical predictor with successive-difference contrast coding, such that the coefficients test the difference between each adjacent pair of stated delays or rewards; in the case of categorical coding, we tested the overall effect of condition by using an F-test to compare the model that included condition with one that did not. We ran both the continuous-condition and the categorical-condition analyses twice: once with ordinary least squares (OLS) regression and once with robust regression using the lmrob function from the robustbase package for R (Maechler et al., 2020). We report the OLS analysis with condition as a factor in the main text, and note any discrepancies between these results and the alternative analyses, whose results are reported in full in the Supplementary Materials (https://osf.io/e2fnt/). The pattern of results was identical for all other versions of the analyses, except that the tendency for male participants to infer lower rewards than female participants in Study 1A had a 95% CI that excluded zero in the robust regression analysis with condition treated as a continuous variable (b = −0.054 [−0.105, −0.003], p = .039).
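Successive-difference contrast coding can be made concrete with a short sketch. The function below builds the contrast matrix for a factor with a given number of levels; it is intended to match what the contr.sdif function in R's MASS package produces, but the text does not say which implementation was used, so that correspondence is an assumption. The condition means in the example are invented.

```python
def sdif_contrasts(n_levels):
    """Successive-difference contrast matrix.

    Rows are factor levels and columns are contrasts; column j compares
    level j + 1 with level j (1-indexed).
    """
    n = n_levels
    return [
        [-(n - j) / n if i <= j else j / n for j in range(1, n)]
        for i in range(1, n + 1)
    ]


def fitted_means(intercept, coefficients, contrasts):
    """Condition means implied by an intercept and contrast coefficients."""
    return [
        intercept + sum(c * b for c, b in zip(row, coefficients))
        for row in contrasts
    ]


# Hypothetical mean log-estimates for three adjacent conditions.
condition_means = [1.0, 1.2, 1.5]

contrasts = sdif_contrasts(len(condition_means))
intercept = sum(condition_means) / len(condition_means)  # mean of condition means
differences = [b - a for a, b in zip(condition_means, condition_means[1:])]

# Using the adjacent differences as coefficients recovers the condition means.
recovered = fitted_means(intercept, differences, contrasts)
print(contrasts)
print(recovered)
```

Because the adjacent differences reproduce the condition means exactly, each coefficient under this coding can be read as the change between one stated delay (or reward) and the next.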

The effects of age and gender on choice
Finally, we took a similar approach to exploring the effects of demographic variables on decision-making by regressing choice on condition, estimate, age, and gender. Again, we conducted several versions of the analysis, to check robustness: for each study we ran 8 versions formed by factorially varying (a) whether condition was coded as a continuous or categorical predictor, (b) whether estimates were entered as raw responses or log-transformed values, and (c) whether conventional or robust logistic regression was used, with the latter implemented via the glmrob function in the robustbase package for R using the recommended "KS2014" setting.
Table 5 shows the results of the standard regression with condition as a categorical variable and using log-transformed estimates. As expected from the previous analysis, willingness to wait was positively associated with stated and inferred reward, and negatively associated with stated and inferred delay. Neither age nor gender was appreciably associated with choice behaviour. All of the other versions of the regression analyses yielded the same conclusions.

Studies 2A and 2B
For Studies 2A and 2B, we pre-registered an analysis plan that comprised all of the analyses from Studies 1A and 1B that incorporated age and gender. We also pre-registered that we would apply all analyses to 3 different versions of the datasets: the full sample; only those participants who passed both attention-checks; and only those attentive participants who also indicated that they had never previously taken part in a monetary intertemporal choice experiment ("attentive novices"). We report the results for the full sample using the same regression output as for Studies 1A and 1B, and again note any differences between these and other versions of the analyses (which are reported in full in the Supplementary Materials). In the pre-registration, we predicted that the effects of condition on estimates and of condition and estimates on choice would match those found in Studies 1A and 1B, and that older participants would infer longer delays and smaller rewards than younger participants. As before, descriptive statistics are shown in Table 2; Figure 1 illustrates the tendency of participants to use a small set of round response values; Figures 2 and 3 show the estimates for each condition; and Figure 4 plots the proportion of participants who chose to wait in each condition. The Kendall correlations are again shown in Table 4; the regression results are shown in Table 6.
The results of both studies are very similar to Studies 1A and 1B, and match the predictions. In Study 2A, participants inferred larger rewards for longer stated delays, and younger participants inferred larger rewards than did older ones, with no meaningful effect of gender. Willingness to choose the delayed option was greater for shorter stated delays and for larger inferred rewards, with no appreciable effect of age or gender. The results were the same for all samples and all regression specifications. In Study 2B, participants inferred larger delays from larger stated rewards, and older participants inferred longer delays than did younger ones. These results were the same for all samples and specifications. In the choice data, the willingness to wait was positively associated with stated reward and negatively associated with inferred delay, with no effect of age or gender. The only discrepancies across the 24 versions of the analysis were that in 6 cases the 99% CIs for the effect of inferred delay included zero; in all of these cases the coefficients for the estimated delay were again negative, and the CIs only just grazed zero.
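The 24 versions of the choice analysis arise from crossing the 3 samples with the 8 regression specifications. The sketch below enumerates them using labels in the Sample.Condition.Estimates.Regression pattern; the exact label strings are hypothetical and chosen only for readability.

```python
from itertools import product

# Factor levels for each part of the specification (label strings are illustrative).
SAMPLES = ("Full", "Attentive", "AttentiveNovice")
CONDITION_CODINGS = ("Continuous", "Factor")
ESTIMATE_CODINGS = ("Raw", "Log")
REGRESSION_TYPES = ("Conventional", "Robust")

# One label per analysis version, e.g. "Full.Factor.Log.Conventional".
specifications = [
    ".".join(parts)
    for parts in product(SAMPLES, CONDITION_CODINGS, ESTIMATE_CODINGS, REGRESSION_TYPES)
]

print(len(specifications))
for label in specifications:
    print(label)
```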
Taken together, the results of Studies 2A and 2B replicate the patterns found in Studies 1A and 1B.

Discussion
We found that: (a) participants typically inferred larger monetary rewards from longer stated delays, and vice-versa; (b) estimates of delay and reward were positively skewed and usually took a relatively small number of distinct values; (c) willingness to wait was positively correlated with stated and estimated rewards, and negatively associated with stated and estimated delays; and (d) older participants inferred longer delays and smaller future rewards than did younger participants, but did not differ in their willingness to wait for the inferred delayed option. We discuss these results and outline future research directions.

Where do the inferences come from?
Our results are consistent with the idea that people approach decisions with prior expectations about the likely trade-off between attributes, and that these expectations shape subsequent decisions (Pleskac & Hertwig, 2014; Tversky & Simonson, 1993). It is important to consider the possible origins of participants' numerical estimates in more detail. One basic distinction is between an associative strategy and a matching strategy. Under the former, our participants encoded past pairings of attribute values and used the provided value of one attribute (e.g., time) to retrieve associated values of the other (e.g., monetary reward). In contrast, magnitude-matching involves producing an estimate for the missing attribute that is subjectively equal to the stated attribute (e.g., responding with a monetary value that "feels the same" as the stated delay, for example by choosing a value that has the same rank position in a contextual set of memory items; e.g., Stewart et al., 2006). Under the associative account, previously-encountered pairings of attributes are critical; in the magnitude-matching account, only the values within a given attribute dimension are important. The two possibilities could therefore be distinguished by constructing training environments which present the same temporal and monetary values in different pairs (cf. Leuker et al., 2018).
[Footnote: Using the nomenclature Sample.Condition.Estimates.Regression to indicate the sample (full, attentive, or attentive-novice), coding of condition (continuous or factor), coding of estimates (raw or log-transformed), and type of regression (conventional vs robust), the 6 cases were: Full.]
Note: Inferred values (estimates) were log-transformed as \(\log_{10}(x + 1)\). \(R^2\) is adjusted \(R^2\). Pseudo-\(R^2\) values are Nagelkerke's \(R^2\). For row names in upper table: d = day(s). For completeness, we have included tests of overall model fit in these and subsequent regressions, although these were not part of our pre-registered analysis plan.
A second distinction contrasts the use of purely within-option information with the use of between-option information. In the case of an associative strategy, people might base their inferences on simple time-money pairings, using the stated reward to "pull out" a likely delay; or they might draw upon the particular combinations of monetary and temporal values that defined the pairs of options in previous choice tasks, a strategy which would permit different inferences about the probable size of a future reward depending on the value and timing of the more immediate option. Likewise, a participant who employs a magnitude-matching approach might simply seek a monetary value that has the same subjective magnitude as the stated delay; or they might seek a monetary value such that the difference between this reward and that of the immediate option matches the difference between the time of the delayed option and "now". In studies like ours, the within-option strategy would mean that people make the same inferences about the missing values irrespective of the value of the immediate reward, a possibility which could readily be tested in future. It is quite possible that different inference/estimation strategies might be used in different contexts.
An alternative approach would be to consider a decision-maker's reasoning process when confronted with these kinds of choices. Decision-makers may expect larger delays to be associated with larger rewards because that would be necessary to make options with a range of delays equally attractive on average. For example, in a financial product marketplace, rewards would have to be greater for long-delay products in order for them to compete with short-delay products.
[Footnote: We are grateful to Jon Baron for this suggestion.]

Implications for theories and studies of time preference
Mathematical models of delay discounting are usually interpreted in terms of the psychological representation of time and money and their interaction. For example, the popular generalized hyperbolic function (Myerson & Green, 1995) is taken to indicate power-law scaling of money coupled with a focus on rate of reinforcement (Doyle, 2013; Green & Myerson, 1996). Our results suggest that choices reflect the interplay between the monetary and time values comprising each option and the decision-makers' expectations about those values. In particular, people may make choices on the basis of whether an offered option seems to be good value relative to their expectation of the trade-off between delay and reward. This could lead to substantial reinterpretation of existing functions; it could also lead to alternative models. An obvious step towards the latter would be a mathematical characterization of the "inference functions" that map stated values of time and money into expectations about missing values of money and time. Indeed, we originally hoped that the current studies would provide a first step in this direction; for example, we experimented with fitting common discounting functions to the estimates from Studies 1A and 1B plotted in Figures 2 and 3. However, although it is possible to fit candidate functions to our participants' data, we ultimately decided it would be unwise because the strong tendency of participants to use "round numbers" means that conventional continuous functions are inappropriate. Although the preference for round numbers is widespread (e.g., Laming, 1997; Matthews & Stewart, 2009), the resulting almost-discrete distributions are not well-characterized, so modelling is a challenge. One open question is whether the limited set of monetary and time values produced by our participants reflects a response tendency or a genuine expectation that the options will be familiar, round numbers. The latter possibility might have implications for studies that employ non-round numbers (e.g., Kirby et al., 1999).
Our results also speak to the methodologies used to study intertemporal choice. Some studies use a single pair of options, or employ an adaptive procedure such that presented trade-offs depend on prior choices and are thus idiosyncratic to the participant, but many studies present a substantial fixed set of options. One common approach is to fix one monetary value (e.g., the delayed reward) and offer a set of possible values for the other, and then to repeat this for a range of delays. Two examples are shown in the top 2 panels of Figure 5. Because the same set of rewards is used for all delays, such studies offer steeper delay-reward trade-offs for shorter delays than for longer ones. (Many other papers use the same kind of approach with different specific values; see e.g., Mahalingham et al., 2014; Shamosh et al., 2008.) The widely-used "monetary-choice questionnaire" (Kirby et al., 1999; Towe et al., 2015) uses a more diverse mixture of monetary and temporal values, but the options again involve steeper trade-offs when delays are short (bottom panel of Figure 5). Presumably these patterns reflect researchers' intuitions that participants will show hyperbolic-like discounting. From our perspective, people's choices in such tasks will (at least partly) reflect the discrepancies between the lines plotted in Figure 5 and the curve describing participants' expectations about the delay-reward trade-off (Figures 2 and 3). Moreover, we would expect participants to update their expectations in light of the options they have encountered earlier in the session, so the curves shown in Figure 5 might shape, not simply measure, participants' discounting functions (see Stewart et al., 2015, for similar ideas, and Alempaki et al., 2019, and Matthews, 2012, for limits on their scope).
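The arithmetic behind the fixed-set design can be made concrete with a small sketch. It assumes a hypothetical design (a fixed delayed reward, a set of immediate rewards, and a set of delays, none taken from any cited study) and computes the extra money offered per day of waiting for every option pair.

```python
# Hypothetical design values, for illustration only.
delayed_reward = 100.0
immediate_rewards = [50.0, 75.0, 90.0]
delays_days = [7, 30, 180]


def extra_reward_per_day(immediate, delayed, delay_days):
    """Slope of the delay-reward trade-off: extra money per day of waiting."""
    return (delayed - immediate) / delay_days


# Map each delay to the slopes implied by the same set of immediate rewards.
slopes = {
    d: [extra_reward_per_day(r, delayed_reward, d) for r in immediate_rewards]
    for d in delays_days
}

for d, values in slopes.items():
    print(f"{d:>3} days: " + ", ".join(f"{v:.3f}" for v in values))
```

Because the monetary differences are identical at every delay while the denominator grows, the slope for any given pair of rewards is necessarily largest at the shortest delay.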

Implications for individual differences in intertemporal choice
Our studies suggest that part of the reason people differ in their preferences is that they have different expectations about the delays and rewards that they will encounter, presumably because of different past experiences with the delay-reward structure of their environment. Beyond offering a possible explanation for unexplained variance in choice behaviour (Myerson et al., 2016), this suggests a new perspective on previously-reported associations between intertemporal choice preferences and a variety of demographic and dispositional variables (e.g., Du et al., 2002; Mahalingham et al., 2014; Reimers et al., 2009). In particular, we found that older adults expected smaller future rewards and longer delays than did younger adults. The absolute value of the effect was not large, but to the extent that older people typically have more pessimistic expectations about delayed rewards than do young people, they will be more pleasantly surprised (or less unpleasantly surprised) by any offered "larger-later" option, and hence presumably more likely to choose it. Several studies have indeed found that older people are more likely to choose to wait in the kind of monetary-choice experiment used here (e.g., Green et al., 1994; Jimura et al., 2011; Li et al., 2013; Löckenhoff et al., 2011; Reimers et al., 2009; Whelan & McHugh, 2009), leading to claims that "delay discounting declines across the lifespan" (Odum, 2011, p. 6). However, other studies have found the opposite effect (e.g., Albert & Duffy, 2012), a curvilinear effect of age (e.g., Richter & Mata, 2018), or no association (e.g., Chao et al., 2009; see Löckenhoff & Samanez-Larkin, 2020). Therefore, rather than making a strong claim about the association between age and time-preference, we simply raise the possibility that variables found to predict temporal discounting may do so partly via effects on expectations about the delay-reward trade-off. (It should also be noted that caution is necessary when interpreting age effects found in volunteer samples such as ours. It is possible, for instance, that people volunteer for different reasons at different ages and some other unobserved variable is driving the observed differences.)

Future Directions
Our results suggest several lines of future work, including:
1. Retaining ambiguity about the unknown values. In our studies, participants were asked to assume that their estimate of the missing delay or reward was correct, such that the choice task used their estimate to form the delayed option. Resolving the ambiguity in this way helps to clarify the relationship between people's inferences about the missing attributes and their indifference points. However, outside the lab people often have to make choices when the delay or reward remains ambiguous (Dai et al., 2019), and it would be useful to explore the links between stated values, inferred values, and choice behaviour in that kind of situation. This would also permit investigation of order effects: our design necessitated that estimates came before choices, but with ambiguous options task order could be counterbalanced to see whether (for example) the act of choosing affects inference, and vice-versa.
2. Investigating inferences and choices in a within-subject design. We focused on "one-shot" decisions so as to avoid participants constructing inferences based on the local environment of the test session (Pleskac & Hertwig, 2014; Skylark & Prabhu-Naik, 2018) but, as noted above, studies of time preference often try to elicit individual discounting functions by presenting people with a set of choices, and generalizing our approach to this paradigm may be profitable.
3. Testing whether environmental contingencies underlie inferences about missing attributes by manipulating the delay-reward trade-off in a training environment to see whether this directly modulates inferences about missing attributes, and choices when all attributes are present, in a subsequent test stage, as has been done for studies of the risk-reward heuristic (Leuker et al., 2018, 2019a, 2019b).
4. Generalizing to other contexts. We focused on expectations about the delay-reward trade-off in psychology research. Although researchers are typically interested in "real" decisions, simplified money-time trade-off questions are used as a testbed for developing and testing theories of intertemporal choice, so understanding the role of expectations in this context is important. Nonetheless, it will be necessary to generalize to other scenarios, for example by asking people to infer the final value of a fixed-term investment fund, or to estimate the price of an "express delivery" service that brings forward the point at which a product can be consumed.
5. Examining whether other individual difference variables predict inferences about missing attributes. Of particular interest are variables such as socio-economic status and the "Big 5" personality traits, which have been associated with patterns of intertemporal choice (Mahalingham et al., 2014; Oshri et al., 2019). Does this reflect different expectations about the delay-reward trade-off? And do differing expectations themselves reflect different environmental experiences, as might be expected for demographic variables such as income?
6. Examining the consequence of violated expectations. Our choice task focused on participants' willingness to wait for the delayed option that they had inferred from the stated values. A straightforward extension would be to present options that deviated from the participant's inference. The simple prediction is that participants who inferred/expected longer delays will be more likely to choose to wait than those who inferred short delays, and vice-versa for inferred rewards.
7. Omitting more information. Choices often involve ambiguity about more than one attribute value. For example, both the shorter and longer delay may be unknown. We can envisage studies that probe expectations about missing values from progressively diminished information, including, in the limit, inferences about the delay-reward trade-off when all that is known is that there are two options that differ in when they occur. Such studies would establish more comprehensively the background knowledge that participants bring to time-preference tasks, and how they construct expectations using this knowledge in combination with the information provided in the task.
We have collected preliminary data to explore some of these issues (details are available from the corresponding author), but there is much more to do in future.

Conclusions
These studies indicate the potential value of probing people's expectations about the trade-off between time and money. Despite widespread individual variation, there were reliable tendencies to infer longer delays from larger rewards, and vice-versa, with implications for theories and empirical investigations of delay discounting. In addition, age was somewhat associated with more pessimistic expectations, such that these expectations may partly explain previously-observed differences in discounting by older and younger adults.
Option A: receive $10 now
Option B: receive [participant's estimate] in 1 day's time
Which would you choose?

Figure 2: Estimated rewards as a function of stated delays, and estimated delays as a function of rewards. The plot shows $\log_{10}(x + 1)$ against condition, with the y-axis tick marks exponentiated to improve clarity. Values have been jittered to reduce over-plotting. The right-hand panels show the same data with a logarithmic x-axis.

Figure 3: Figure 2 re-plotted without the raw data points. Error bars show confidence intervals (95% for Studies 1A and 1B, 99% for Studies 2A and 2B).

Figure 4: Choice of delayed option as a function of stated delay and stated reward. Error bars are Wilson confidence intervals (95% for Studies 1A and 1B; 99% for Studies 2A and 2B). The dotted line indicates indifference.
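As a reminder of how a Wilson interval for a choice proportion can be computed, the sketch below implements the standard Wilson score interval for a binomial proportion. It assumes no continuity correction (the caption does not say whether one was applied), and the function name and the example counts are hypothetical rather than taken from the studies.

```python
import math

Z_95 = 1.959964  # two-sided 95% critical value of the standard normal
Z_99 = 2.575829  # two-sided 99% critical value of the standard normal


def wilson_interval(successes, n, z=Z_95):
    """Wilson score interval for a proportion (no continuity correction)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1.0 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = (z / denom) * math.sqrt(
        p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)
    )
    return centre - half_width, centre + half_width


# Hypothetical example: 30 of 50 participants choose the delayed option.
print(wilson_interval(30, 50, Z_95))
print(wilson_interval(30, 50, Z_99))
```

Unlike the simple normal-approximation interval, the Wilson interval stays within 0 and 1 and is not centred on the observed proportion, which makes it better behaved when proportions are near the boundaries.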

Figure 5: Trade-offs between money and time in typical experiments of intertemporal choice.
Descriptive statistics for inferred rewards and inferred delays.
Figure 1: Unique responses in each experiment. In all studies, participants tend to use just a handful of response values when inferring rewards or delays, although this is against a background of more idiosyncratic estimates. The numbers below the x-axis label the most popular values, along with the smallest and largest estimate in each study. Note that the x-axis is ordinal: the values are simply arranged from smallest to largest.

Table 2 note: Med = Median; LQ = lower quartile; UQ = upper quartile; GeoM = geometric mean, calculated as $[10^{\mathrm{mean}(\log_{10}(x + 1))}] - 1$.
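The geometric mean in the table note can be illustrated with a short sketch. It assumes the formula denotes 10 raised to the mean of log10(x + 1), minus 1, where x is an individual estimate; the function name and the example values are hypothetical.

```python
import math


def geometric_mean_plus_one(values):
    """Back-transformed mean of log10(x + 1): 10 ** mean(log10(x + 1)) - 1."""
    if not values:
        raise ValueError("values must be non-empty")
    mean_log = sum(math.log10(v + 1) for v in values) / len(values)
    return 10**mean_log - 1


# Hypothetical estimates, for illustration only.
estimates = [0, 9, 99]
print(geometric_mean_plus_one(estimates))
```

Adding 1 before taking logarithms allows estimates of zero to be included, and subtracting 1 afterwards returns the summary to the original scale.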
Table 5 shows the results for Studies 1A and 1B. Consistent with the earlier analyses, estimated rewards are positively associated with stated delays (Study 1A), and estimated delays are positively associated with stated rewards (Study 1B). In addition, estimated rewards are negatively associated with age in Study 1A and estimated delays are positively associated with age in Study 1B. The analysis also suggests that, in Study 1B, males inferred longer delays than did females.
Table 5: Regression results for Studies 1A and 1B.

Regression results for Studies 2A and 2B.