Taking the New Year’s Resolution Test seriously: Eliciting individuals’ judgements about self-control and spontaneity

Abstract: Self-control failure occurs when an individual experiences a conflict between immediate desires and longer-term goals, recognises psychological forces that hinder goal-directed action, tries to resist them, but fails in the attempt. Behavioural economists often invoke assumptions about self-control failure to justify proposals for policy interventions. These arguments require workable methods for eliciting individuals’ goals and for verifying occurrences of self-control failure, but developing such methods confronts two problems. First, it is not clear that individuals’ goals are context-independent. Second, facing an actual conflict between a desire and a self-acknowledged goal, a person may consciously choose not to resist the desire, thinking that spontaneity is more important than self-control. We address these issues through an online survey that elicited individuals’ self-reported judgements about the relative importance of self-control and spontaneity in conflicts between enjoyment and health-related goals. To test for context-sensitivity, the judgement-elicitation questions were preceded by a memory-recall task which directed participants’ attention either to the enjoyment of acting on desires or to the satisfaction of achieving goals. We found little evidence of context-sensitivity. In both treatments, however, judgements that favoured spontaneity were expressed with roughly the same frequency and strength as judgements that favoured self-control.

In offering guidance about public policy, behavioural economists often use the concept of self-control failure to support claims that, in specific contexts, individuals' actual choices are contrary to their best interests, as judged by those individuals themselves. Policy interventions are proposed with the aim of helping people to reach goals that they acknowledge as theirs, but which forces within their own psychology hinder them from achieving. 1 An individual experiences self-control failure if he or she is aware of such forces, tries to resist them, but fails in the attempt. Policies that are designed to counter such failures can be defended as non-paternalistic on the grounds that the policy-maker's intention is to satisfy individuals' self-acknowledged goals. 2 If this kind of justification is to be used to guide public policy, behavioural economics needs workable methods for eliciting individuals' goals and for verifying occurrences of self-control failure. Surprisingly little attention has been given to the design of such methods. Potential questions about people's goals are often treated as having self-evident answers, as when Thaler and Sunstein (2008: 73), advocating nudging people towards healthier lifestyles, introduce the New Year's Resolution Test: '[H]ow many people vow to smoke more cigarettes, drink more martinis, or have more chocolate donuts in the morning next year?' For Thaler and Sunstein, this question is merely rhetorical, but the premise of our paper is that the actual content of people's self-acknowledged goals, and the normative significance that people actually attach to those goals, are research topics that require real answers.
It is uncontroversial that many people profess long-term goals from which they sometimes deviate when facing more immediate desires. However, recognition of such inconsistencies leaves open the question of how they should be interpreted from a normative point of view, or in the formation of public policy. Advocates of interventions to counter self-control failures often appeal to dual-self interpretations. For example, Thaler and Sunstein (2008: 41-42) represent self-control problems in terms of a person's two selves, a far-sighted 'Planner' and a myopic 'Doer'. Describing the roles of these two selves, Thaler and Sunstein say: 'The Planner is trying to promote your long-term welfare but must cope with the feelings, mischief, and strong will of the Doer, who is exposed to the temptations that come with arousal'. There is an implicit assumption that the Planner's judgements about goals retain their normative authority - they represent what the person 'truly' values - even when the actual human being is consciously choosing to ignore them. On this view, public policy should help the Planner to maintain control when the Doer is trying to rebel. 3 But if policymakers want to respect individuals' own judgements, they need to ask how far, and how consistently, real human beings identify with their Planners rather than with their Doers. Our aim in this paper is to place these questions on the agenda of behavioural public policy and to take some preliminary steps towards answering them.
We start from the idea that self-control - constraining yourself to act on long-term goals - can be contrasted with spontaneity - leaving yourself scope to respond to situations as they arise, rather than following pre-scripted plans. As an informal illustration of this contrast, consider a different New Year's Resolution Test, in which individuals are asked to rate the importance they attach to keeping their New Year's resolutions. A high rating might be interpreted as a personal judgement in favour of self-control relative to spontaneity; a low rating might be interpreted as the opposite. Normatively, neither of these attitudes seems self-evidently superior to the other; one might associate self-control with the 'Big Five' trait of conscientiousness, and spontaneity with openness to experience. Thus, acting contrary to a professed goal should not be treated as sufficient evidence of a failed attempt to exercise self-control or of a desire to be helped: as viewed by the actor, it might be a conscious expression of spontaneity.
In this paper, we report an online survey which investigated the relative importance that individuals attach to self-control and spontaneity in relation to conflicts between immediate enjoyment and long-term health goals. We focus on three issues that need to be faced if public policies are to be justified as non-paternalistic responses to self-control failures.
First, do people generally judge self-control to be more important than spontaneity? That this is the case is an implicit assumption of many behavioural policy recommendations, but might turn out to be false.
Second, how homogeneous are people's judgements about these attitudes? Even if the judgements of a majority of people favoured self-control, policymakers might want to take some account of contrary judgements held by significant minorities.
Third, are those judgements context-independent? Much of the evidence that is cited in support of the hypothesis of self-control failure is about the context-dependence of revealed preferences - the fact that preferences between given consumption possibilities differ systematically according to the presence or absence of cues that draw attention to particular temptations. It is also known that when people report their overall satisfaction with their lives, the implicit weights they give to different aspects of life can depend on what is currently the focus of their attention (Schkade and Kahneman, 1998) - an effect summed up in Kahneman's (2011: 402) 'fortune cookie' maxim: 'Nothing in life is as important as you think it is when you are thinking about it'. If preferences and life-satisfaction judgements are often context-dependent, the same might be true of people's attitudes to self-control and spontaneity.
Since our respondents were not recruited as a representative sample, our survey must be treated as an exploratory study. Our main findings were as follows. Overall, judgements that favoured spontaneity were expressed with roughly the same frequency and strength as judgements that favoured self-control. When asked to express judgements about what was important in life, most participants maintained both that it was important to make long-term plans and stick to them and that there was no harm in occasionally taking small enjoyments rather than sticking to those plans. We found strong evidence of individual heterogeneity. Because of this, we defined an ordinal scale for measuring individuals' judgements about the relative importance of self-control and spontaneity in relation to everyday conflicts between enjoyment and health-related goals. Using this scale, we found little evidence of context-dependence.
We conclude that identifying when and where individuals want to be helped to avoid self-control failures is more difficult - both empirically and conceptually - than many behavioural economists seem to think. We believe that our findings point to the importance of treating desires for spontaneity as no less deserving of attention than desires for self-control, and that they suggest interesting lines of further research.
As a preliminary to describing our study, we briefly discuss some of the ways in which the concept of self-control failure is used in behavioural economics, and review some existing evidence about people's everyday experience of self-control problems.

Self-control failure in behavioural economics
The idea that deviations from neoclassical rationality can be explained in terms of self-control failure is common in behavioural economics. Such explanations are not confined to modes of behaviour that might reasonably be viewed as addictions, such as problem drinking, compulsive gambling or reckless borrowing. For example, the possibility of self-control failure has been invoked in explanations of inconsistencies in time preference, of why people buy gym memberships when it would be cheaper to pay for each visit separately (DellaVigna and Malmendier, 2006), 4 of why shoppers use inconvenient entrances to stores rather than walk past charity collectors (Andreoni, Rao and Trachtman, 2017), and of why experimental subjects show extreme risk aversion when choosing between low-prize lotteries (Fudenberg and Levine, 2006).
In two founding manifestos of behavioural welfare economics, Camerer et al. (2003: 1217, 1238-1242) and Sunstein and Thaler (2003: 1168, 1184) cite inconsistencies in time preference as examples of self-control failures that public policy should aim to counter. The emphasis on self-control failure is particularly marked in Sunstein and Thaler's paper, which uses the composite phrase 'bounded rationality and bounded self-control' as a generic term for the class of errors or biases that individuals should be helped to avoid. As Sunstein and Thaler (2003: 1162) put it in a programmatic passage, their concern is with cases in which 'individuals make inferior decisions in terms of their own welfare - decisions that they would change if they had complete information, unlimited cognitive abilities, and no lack of self-control'. When they present their now-famous cafeteria story, they say that their proposal that the fruit should be placed before the desserts on the cafeteria counter is designed 'to help people to solve their self-control problems' (Sunstein and Thaler, 2003: 1184).
In their later book Nudge, Thaler and Sunstein repeatedly emphasise that their recommendations are designed to 'make choosers better off, as judged by themselves' (2008: 5, italics in original; see also 10, 12, 80). The implication is that Thaler and Sunstein's concept of self-control failure involves acting contrary to one's own judgement. But when they set out the justification for nudging people towards healthy eating, their main arguments are not based on evidence about people's actual judgements:
'Consider the issue of obesity. Rates of obesity in the United States are now approaching 20 per cent, and more than 60 per cent of Americans are considered either obese or overweight. There is overwhelming evidence that obesity increases risks of heart disease and diabetes, frequently leading to premature death. … We do not claim that everyone who is overweight is necessarily failing to act rationally, but we do reject the claim that all or almost all Americans are choosing their diet optimally… With respect to diet, smoking, and drinking, people's current choices cannot reasonably be claimed to be the best means of promoting their wellbeing. Indeed, many smokers, drinkers and overeaters are willing to pay third parties to help them make better decisions.' (p. 7)
Notice that, except in the final sentence (which surely applies only to a small minority of the relevant population), the evidence that Thaler and Sunstein present is about the harmful effects of obesity, not about the mental processes that underlie the choices that cumulatively cause it. As Sugden (2017) points out, self-control failure is only one of a wide range of psychological mechanisms that are potential explanations of an individual's tendency to eat unhealthy but enjoyable food. Many of these mechanisms - for example, procrastination, social proof, availability, cognitive dissonance, intrapersonal empathy gaps and self-deception - do not depend on the assumption that, in the individual's own judgement, there is a conflict between his immediate desires and his long-term well-being. Because what is at issue here is not whether people's choices are rational or optimal or the best means of promoting their well-being, but whether they are the result of self-control failure, interventions that are designed to make people better off as judged by themselves must be based on information about people's actual judgements.
Many behavioural-economic discussions of self-control problems use the distinction between System 1 and System 2 mental operations, as proposed by Wason and Evans (1975) and developed by Kahneman (2003). System 1 operations are fast and automatic; System 2 operations are slow and under conscious control. This distinction is used by Thaler and Sunstein (2008: 19-39) and Kahneman (2011) as a way of organising behavioural findings. In both cases, the suggestion is that System 1 is liable to induce preferences and judgements that are systematically biased and that System 2 is capable of correcting. As Kahneman and Sunstein (2006: 92) put it: 'System 1 quickly proposes intuitive answers to judgment problems as they arise, and System 2 monitors the quality of these proposals, which it may endorse, correct, or override'. Self-control problems can be fitted into this framework by treating a person's immediate desires as unmediated products of System 1, and her judgements about her goals as products of System 2 operations (to which System 1 may have provided inputs). However, we see no good reason to assume that System 2 is always able to produce context-independent judgements. (Recall the possibly analogous finding that judgements about life satisfaction can be context-dependent.) Nor is it necessarily the case that, as Kahneman (2011: 21) suggests, '[w]hen we think of ourselves, we identify with System 2, the conscious, reasoning self that has beliefs, makes choices, and decides what to think about and what to do'. A person who values spontaneity might sometimes identify with her System 1 feelings, and override goal-directed thoughts issuing from her System 2.

Evidence of everyday self-control
Hofmann et al. (2012a, 2012b) report a seminal study which uses experience sampling to elicit individuals' experiences of desires over the course of a week of ordinary life. Each participant was asked to carry a smartphone for seven consecutive days. Each day, the participant received seven calls, at randomised intervals, and reported desires that she was currently feeling or had felt in the last half hour. For each of these desires, she reported: (i) its content, (ii) its strength, (iii) whether she had attempted to resist it, (iv) whether she had enacted it, (v) whether it conflicted with a personal goal, and if so, (vi) the goal with which it conflicted. This study can be interpreted as sampling individuals' everyday experiences of self-control problems, and the authors themselves (2012a: 1319) interpret it in relation to a four-stage psychological model of 'intentional' and 'effortful' self-control. In the first stage, the individual experiences some desire. In the second stage, she assesses whether the desire conflicts with a goal. If she recognises such a conflict, the desire is problematic. If so, there is a third stage in which she decides whether to resist the desire. If she decides to resist, she attempts self-control. If so, there is a fourth stage in which the attempt may succeed (the desire is not enacted) or fail (it is enacted).
Delaney and Lades (2017) report a study that is closely modelled on that of Hofmann et al., but uses the day reconstruction method (Kahneman et al., 2004) in place of experience sampling. Each participant was first asked to complete a time-use diary of the previous day, breaking the day down into a set of discrete episodes, and then responded to a series of questions about each episode. Participants reported up to three desires that they had felt during the episode. For each desire, they answered questions similar to those used by Hofmann et al. to elicit properties (i) to (vi).
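As an illustration of how reports of properties (iii) to (v) map onto the four-stage model of Hofmann et al. (2012a), the following minimal Python sketch classifies a single reported desire. The class name, field names and category labels are our own illustrative assumptions; they are not taken from either study's data files or coding scheme.

```python
from dataclasses import dataclass


@dataclass
class DesireReport:
    """One reported desire. Field names are hypothetical, chosen for illustration."""
    conflicts_with_goal: bool  # property (v)
    resisted: bool             # property (iii)
    enacted: bool              # property (iv)


def classify(report: DesireReport) -> str:
    """Place a reported desire in the four-stage model (labels are illustrative)."""
    # Stage 2: a desire is 'problematic' only if it conflicts with a goal.
    if not report.conflicts_with_goal:
        return "unproblematic"
    # Stage 3: the individual may consciously decide not to resist.
    if not report.resisted:
        return "problematic, not resisted"
    # Stage 4: an attempt at self-control fails if the desire is enacted.
    if report.enacted:
        return "self-control failure"
    return "self-control success"


print(classify(DesireReport(conflicts_with_goal=True, resisted=False, enacted=True)))
```

On this reading, only the third outcome counts as self-control failure; a problematic desire that is enacted without any attempt at resistance falls into a separate category.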
The two studies produced broadly similar results. Hofmann et al.'s 205 participants reported a total of 7827 desire episodes, of which 47 per cent were problematic. Of all desires (problematic or not), 42 per cent were actively resisted. Of those that were resisted, 17 per cent were enacted (Hofmann et al., 2012a: 1325). 5 Delaney and Lades's 142 participants reported a total of 2059 desires. 69.1 per cent of these desires were problematic. Participants attempted to resist 56.1 per cent of problematic desires. In 35.1 per cent of cases in which such an attempt was made, the attempt failed (p. 1161). Expressing Delaney and Lades's data in more stylised form: on an average sampled day, an average participant faced ten self-control problems, attempted self-control five or six times, and experienced self-control failure twice. Hofmann et al. (2012b: 584-585) report that the categories of goals that most commonly conflicted with desires were 'health-related' (directed at health or fitness) and 'time use' (directed at using time efficiently and getting things done); self-control failure rates were highest for desires for 'engaging in media use' and for 'work' (not, interestingly, for avoiding work). In Delaney and Lades's study, the most common self-control failures were in relation to 'postponing a task' and 'using social media' (p. 1161).
The most frequently reported self-control problems might seem rather mundane (Hofmann et al. [2012b: 585] express mild surprise at not finding more evidence of 'disastrous failures to control sexual impulses and urges to spend money'), but this is likely to be the effect of the everyday sampling frame. It seems clear that self-control problems, successes and failures are all common features of everyday life. To this extent, the evidence is consistent with a System 1/System 2 model in which automatic processes produce desires which cognitive processes sometimes but not always override. However, the observation of self-control attempts does not imply that individuals' judgements about their goals, or about the importance of being goal-directed, are stable or context-independent. Indeed, Hofmann et al. (2012a: 1328-1329) report some evidence to the contrary. Some of their participants responded to questions about the presence or absence of other people when the desires were experienced. The presence of other people had no significant effect on the strength of desires or on perceptions of conflict between desires and goals, but individuals were more likely to attempt to resist highly-conflicted desires if other people were around. There was also a tendency for the presence of other people to inhibit the enactment of desires in general, whether problematic or not. In relation to our research questions, it is particularly interesting that many problematic desires (44 per cent in Delaney and Lades's study, possibly around 30 per cent in Hofmann et al.'s - see note 5) were not resisted. These were cases in which a participant recognised that a current desire conflicted with a personal goal, but chose not to attempt self-control. In other words: acting contrary to a self-acknowledged goal is not equivalent to a failure of self-control.
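The stylised daily figures for Delaney and Lades's study can be reproduced from the reported totals. The short Python sketch below does the arithmetic; it assumes that each participant contributed exactly one sampled day (the previous day's diary) and that the three percentages apply as successive conditional shares, as the text describes. Variable names are ours.

```python
# Figures reported for Delaney and Lades (2017) in the text above.
participants = 142
desires = 2059
share_problematic = 0.691  # of all desires
share_resisted = 0.561     # of problematic desires
share_failed = 0.351       # of resisted problematic desires

# Assumption: one sampled day per participant.
desires_per_day = desires / participants
problems_per_day = desires_per_day * share_problematic
attempts_per_day = problems_per_day * share_resisted
failures_per_day = attempts_per_day * share_failed

# Problematic desires for which no self-control was attempted.
share_not_resisted = 1 - share_resisted

print(f"desires per day:       {desires_per_day:.1f}")
print(f"self-control problems: {problems_per_day:.1f}")
print(f"self-control attempts: {attempts_per_day:.1f}")
print(f"self-control failures: {failures_per_day:.1f}")
print(f"problematic desires not resisted: {share_not_resisted:.0%}")
```

The results (about 14.5 desires, 10.0 problems, 5.6 attempts and 2.0 failures per day, with about 44 per cent of problematic desires not resisted) agree with the rounded figures given in the text.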

Our survey design
Our survey was designed to investigate individuals' judgements about the personal importance of being disposed to resist desires that conflict with goals. We focused on conflicts between desires for immediately enjoyable activities and goals for good health - a domain in which, as evidenced by Hofmann et al.'s findings, conflicts between desires and goals are particularly common.
It is important to recognise that our objective was not to elicit individuals' current preferences for or against imposing constraints on their future choices. We take as given that, as implied by Planner/Doer models and as shown in the everyday self-control studies reviewed in the preceding section, people's preferences with respect to self-control are often temporally inconsistent. Acting on the preferences of her Planner, a person may choose to impose constraints on her possible future actions, but when the future arrives, she may act on the preferences of her Doer in trying to evade those constraints. Our intention was not to add to existing knowledge about the content of these two types of preferences: it was to investigate how people adjudicate between them - how far they identify with their Planners, and how far with their Doers. Participants were compensated for the time spent in answering the survey questions but, since the purpose of those questions was to elicit individuals' personal judgements about their own lives, there was no way of rewarding 'correct' or 'successful' responses.
To allow an investigation of context-dependence (or independence), survey respondents were divided randomly between two treatments. All respondents answered the same questions about attitudes to self-control and spontaneity. In one treatment, these questions were preceded by a task that was intended to direct attention to the enjoyable properties of activities that might conflict with health-related goals. In the other, they were preceded by a task that was intended to direct attention to the sense of satisfaction derived from achieving health-related goals through activities that might not be immediately enjoyable.
In pre-registering the study, we did not propose any hypothesis about the direction in which individuals' judgements about the importance of self-control might be differentially affected by cues that directed their attention either to enjoyment or to goals. The most obvious possibility is that cues that direct attention to enjoyment induce judgements that give a low weight to self-control. But we recognised an alternative possibility: thoughts about enjoyment might act as reminders to exert self-control, and so induce judgements favouring self-control. Either effect, if observed, would cast doubt on the idea that individuals have stable attitudes to self-control problems. Because each participant's attention has to be directed in only one direction, we must use between-subject tests. A corollary of this is that there is little scope for investigating whether some individuals are susceptible to one effect and some to the other.
The survey had three parts - an attention-focusing task, a judgement elicitation task, and a short questionnaire. We used a pre-registered pilot study to determine the form of the attention-focusing task. 6 The design of the pilot was very similar to that of the main study, except that there were two pairs of attention-focusing treatments rather than the single pair used in the main survey. In the pilot, 241 participants were randomly distributed between the resulting four treatments; 229 completed the survey. One pair of treatments used memory recall tasks. In the memory/enjoyment treatment, participants recalled memories relating to enjoyments that might be thought to conflict with health-related goals. In the memory/goals treatment, they recalled memories of being satisfied about efforts to achieve health-related goals. We will describe these treatments in more detail later. The other pair of treatments used pictures to activate thoughts about enjoyment or long-term goals. In the pictures/enjoyment treatment, participants were shown fourteen pictures, comprising six enjoyable foods, six enjoyable activities, one healthy food, and one healthy activity, and for each picture answered 'How much would you enjoy this?' on a 7-point scale. In the pictures/goals treatment, participants were shown fourteen pictures, comprising six healthy foods, six healthy activities, one enjoyable food, and one enjoyable activity, and for each picture answered 'How good do you think this would be for your health?' on a 7-point scale. We pre-committed to using whichever type of attention-focusing task induced more significant differences (irrespective of direction) in responses to the judgement elicitation task. The judgement elicitation task comprised 32 questions divided by topic into four blocks - 'self-control wishes', 'spontaneity wishes', 'regrets' and 'importance in life' - each of eight questions. When pictures were used, there was no significant treatment effect for any of the four blocks. When memory recall tasks were used, there were significant effects (higher scores on a spontaneity/self-control scale in the memory/goals treatment) for two blocks (regrets and importance in life), and no significant effects for the other two blocks. Following our pre-registered plan, we used the memory recall tasks in the main survey, which we now describe.
The survey was programmed in oTree and conducted online using the Prolific platform. We recruited 240 participants (90 male and 150 female), all of whom were UK residents with English as their first language. The median age was 31. Each participant received £2 for participating. The median duration of the survey was less than ten minutes. To ensure good quality responses to tasks that were necessarily non-incentivised, the survey was designed to be short, easy to understand, and engaging. The content of the survey, the hypotheses and the analysis plan were pre-registered. 7 The three parts of the survey are described next, in the order in which they were faced.

The attention-focusing task
For this task, each participant was randomly assigned to one of two treatments, designed to activate thoughts either about immediate enjoyment or about long-term goals. The randomisation ensured that there were 120 participants in each treatment. In the enjoyment treatment, participants were asked to think about 'a memorable meal you have had (e.g. at a restaurant, at a friend's house) when you particularly enjoyed the food'. They were then asked to type answers to four questions about the meal: where it took place, who was there, which dish the respondent enjoyed most, and (with the added request to be 'as detailed as you can') why they enjoyed it. Finally they reported how much they had enjoyed the dish, on a seven-point Likert scale. In the goals treatment, participants were asked to think about 'some effort that you have made (e.g. in relation to exercise or diet) that was good for your health and that you felt satisfied about having done'. They were then asked to type answers to four questions about the effort: what they did, when they did it, why it was good for their health, and (with the added request to be 'as detailed as you can') why they felt satisfied about it. Finally they reported how satisfied they had been, on a seven-point Likert scale.
Although this task might be regarded as a 'priming' manipulation, it is significantly different from the types of task typically used in psychology to test hypotheses about priming. For example, in a widely-cited experiment which built on the early work of Higgins et al. (1977) and Srull and Wyer (1979), Bargh et al. (1996) used tasks in which subjects were asked to unscramble sentences. The content of these sentences varied; some were related to rudeness, some were related to politeness, and some were neutral. The key finding was that, in a second stage of the experiment that was not overtly linked to the unscrambling task, subjects who had unscrambled sentences about rudeness were more likely than the others to interrupt a conversation. Many subsequent experiments have used scrambled sentences or word search tasks as manipulations. 8 These experiments test the hypothesis that, as a result of psychological processes below the level of consciousness, incidentally presented words or sentences activate mental constructs that influence subsequent judgements or behaviour. In contrast, our design investigates the effects of conscious processes of memory retrieval, reasoning and judgement. In our judgement elicitation task, participants need to think about the relative importance to them of self-control and spontaneity. In striking such a balance, it really is relevant to recall previous experiences of conflict between desires and goals, previous feelings of enjoyment from acting contrary to one's goals, and previous feelings of satisfaction from exercising self-control. Our memory recall tasks were intended to direct a participant's attention towards one or other of two sets of relevant considerations, aligned with either self-control or spontaneity. In this respect, they are similar to the manipulations used by Schkade and Kahneman (1998) to investigate 'focusing illusions'. Importantly, they activate thoughts about temptations, but do not present actual temptations. 
To put this another way, our objective was to elicit cool judgements about situations which, if actually experienced, would be liable to activate hot emotions.
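The paper does not say how the randomisation into two treatments of exactly 120 participants was implemented. One simple way to obtain such a balanced allocation is to shuffle a list containing equal numbers of each treatment label. The Python sketch below is offered only under that assumption; the function name, the participant identifiers and the use of a seed are all hypothetical.

```python
import random


def assign_balanced(participant_ids, treatments=("enjoyment", "goals"), seed=None):
    """Randomly assign participants so that every treatment gets the same number.

    Hypothetical helper: it illustrates balanced randomisation in general,
    not the procedure actually used in the survey software.
    """
    ids = list(participant_ids)
    if len(ids) % len(treatments) != 0:
        raise ValueError("number of participants must divide evenly between treatments")
    per_treatment = len(ids) // len(treatments)
    # Build a list with an equal number of each label, then shuffle it.
    labels = [t for t in treatments for _ in range(per_treatment)]
    random.Random(seed).shuffle(labels)
    return dict(zip(ids, labels))


allocation = assign_balanced(range(1, 241), seed=2024)
counts = {t: list(allocation.values()).count(t) for t in ("enjoyment", "goals")}
print(counts)
```

With 240 participants and two treatments, this always yields 120 participants per treatment, whereas independent coin flips for each participant would give equal group sizes only by chance.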

The judgement elicitation task
In this part of the study, every participant answered the same 32 questions. In each case, the respondent was shown a statement and asked 'How well do you recognise yourself in this statement?' Responses were recorded on a seven-point Likert scale ranging from 'not at all' to 'extremely well'. For the purposes of the design, the questions were pre-classified into four blocks, each of eight questions. However, this structure was not revealed to participants, who simply faced the 32 questions in a single sequence, randomised independently for each participant. The statements are listed below, grouped into the four blocks and numbered for ease of reference. Sixteen of these statements express attitudes favouring self-control; the other sixteen (marked with asterisks) express attitudes favouring spontaneity.

Self-control wishes
Q1. I wish I ate more fruit and vegetables.
Q2. I ought to drink more mineral or tap water.
Q3. I wish I drank less sugary drinks.
Q4. I ought to eat less high-fat food.
Q5. I wish I took more exercise.
Q27. It's important to make long-term plans and stick to them.
Q28. In life it's important to be able to resist temptation.
Q29.* There's no harm in occasionally taking a break from an exercise routine.
Q30.* Having occasional treats is an important source of happiness for me, even if they are bad for my health.
Q31.* I like my plans to be flexible and leave plenty of room for spontaneity.
Q32.* Tomorrow will look after itself - each day has troubles enough of its own.
The self-control wishes can be paraphrased as 'I wish I exercised (or I ought to exercise) more self-control'. They compare the individual's actual degree of self-control with a counterfactual state in which that degree is higher. They are examples of the kind of statement that Thaler and Sunstein (2008: 107) interpret as indicating that the individuals who make them 'are open to a nudge' and 'might even be grateful for one'.
The spontaneity wishes have the opposite content: 'I wish I exercised less self-control'. Agreeing with these statements carries the suggestion that being nudged towards self-control might be unwelcome.
The regret statements Q17-Q20 can be paraphrased as 'After taking enjoyment rather than attempting self-control, I feel regret'. Such statements are often interpreted as ex post reports of self-control failure. In a Planner-Doer model, they might be interpreted as expressing a person's identification with their inner Planner. Q21-Q24 have one of two opposite contents: either 'After exercising self-control rather than taking enjoyment, I feel regret' (Q21 and Q22) or 'After taking enjoyment rather than exercising self-control, I do not feel regret' (Q23 and Q24). They might be interpreted as expressing identification with an inner Doer (either by regretting an earlier decision in favour of self-control, or by approving an earlier decision not to exercise self-control). Notice that the regret statements differ from the two types of wish statement in not using the respondent's current lifestyle as a reference point. Thus, a person who believed that her current degree of self-control was about right might disagree with the wish statements while recognising the regret statements as descriptions of her feelings on occasions when she exercised too little or too much self-control.
The importance in life statements are more reflective or philosophical than the others. They express judgements about the overall importance of self-control and spontaneity in the respondent's life. Q25-Q28 have the connotation 'In life, it's important to exercise self-control'; statements Q29-Q32 have the contrasting connotation 'In life, it's important to be spontaneous'. One might say that the first set of statements expresses a Planner's view of life, while the second set expresses a Doer's.

The questionnaire
In the final questionnaire, participants reported their age and gender and whether they followed any specific dietary regime. Participants also answered two questions in which they were asked to rate the healthiness of their own habitual diet and exercise: 'Relative to an average person of your age, would you say that your diet was: much less healthy … much healthier?' and 'Relative to an average person of your age, would you say that you take: much less exercise … much more exercise?' Each of these questions elicited responses on a seven-point Likert scale. Our intention was to use participants' responses to these questions as control variables in our tests for differences between the enjoyment and goals treatments.

Responses to the memory recall tasks
We reviewed the responses to the memory recall task to check that our manipulation had worked as intended – that participants had taken the task seriously and had recalled specific experiences of enjoyable dishes or of satisfaction from healthy activities. In considering each treatment, we gave particular attention to the question to which participants had been asked to be as detailed as they could.
In both treatments, all participants followed the instructions by describing some specific meal or activity, and gave relevant responses to all the memory recall questions. The median length of answers to the 'details' question was 18 words in the enjoyment treatment and 19.5 words in the goals treatment (the respective means were 21.4 and 23.3). Some respondents wrote whole paragraphs, but even the short answers gave cogent reasons for feeling enjoyment (for example, 'I love lamb cooked over long time', 'Love puddings, sweet and tasty') or satisfaction ('Lost weight and felt happier', 'Felt energised'). Participants' ratings of remembered enjoyment or satisfaction on the 1-7 Likert scale were very high (the mean ratings were 6.69 for enjoyment and 6.23 for satisfaction).
In answering the 'details' question in the enjoyment treatment, most participants wrote (often quite lyrically) about the enjoyable features of the dish. For example, on a meal eaten at home: 'I enjoyed this meal because the salmon was very moist and succulent. We also had potato and leek gratin and broccoli'. On smoky BBQ ribs eaten at a restaurant: 'It has a lot of flavour to it and the meat just falls off the bone. They are big ribs too.' In the goals treatment, almost all respondents described efforts concerning either diet or exercise. In descriptions of how the effort had been satisfying, two themes were particularly prevalent. One was pleasure in feeling or looking healthier. For example, on a workout every morning: 'It boosted my mood and made me less tired throughout the day. It also made me feel fitter.' On going vegan: 'I got thinner and felt healthier. It was great looking in the mirror and seeing myself as well as feeling less tired.' The other theme was self-empowerment. For example, on losing weight: 'Because I felt like I had achieved something by using my willpower'. On exercise: 'Because it made me feel better about my self'.
Overall, these responses are coherent and psychologically credible, and clearly unaffected by the lack of direct incentives. It seems clear that the attention-focusing task was engaging and activated real memories of personally significant episodes of enjoyment and self-control.

Tests for context-dependence
Following a pre-registered plan for the analysis of the data, we coded each participant's response to each of the 32 questions from 1 to 7 on a spontaneity/self-control scale, reverse coding the statements favouring spontaneity. So, on this scale, 1 corresponds with 'not at all' recognising oneself in a statement favouring self-control or recognising oneself 'extremely well' in a statement favouring spontaneity; 7 corresponds with the opposite extremes. The central point of the scale is 4. Our prior expectation was that, for each participant considered separately, there would be positive correlation between responses (coded as we have described) to any pair of questions, legitimating the use of an individual's mean score for the whole set of 32 questions as an index of their overall judgements about spontaneity and self-control (interpreted as opposites of one another on a single psychological scale). However, we recognised that each of the four blocks of questions tapped into a different aspect of spontaneity/self-control judgements, and hence that individuals' mean scores for each of the four separate blocks might have additional information content. Our pre-registered tests for context-dependence in spontaneity/self-control judgements use non-parametric and regression methods to compare overall and block-specific scores between the two treatments.
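The coding rule described above can be illustrated with a minimal Python sketch. It is not the analysis code used in the study: the function names, the set `SPONTANEITY_ITEMS` (which lists only the four asterisked importance in life items, not the full set of sixteen) and the example participant are hypothetical and serve only to show how reverse coding and mean scores work.

```python
from statistics import mean

# Illustrative subset of question numbers whose statements favour spontaneity
# (assumption: only Q29-Q32 are listed here, not all sixteen starred items).
SPONTANEITY_ITEMS = {29, 30, 31, 32}


def code_response(question, raw):
    """Map a raw 1-7 Likert response onto the spontaneity/self-control scale."""
    if not 1 <= raw <= 7:
        raise ValueError("Likert responses must lie between 1 and 7")
    # Reverse-code spontaneity items so that 7 always favours self-control.
    return 8 - raw if question in SPONTANEITY_ITEMS else raw


def mean_score(responses, questions=None):
    """Average coded score over all answered questions, or over one block."""
    if questions is None:
        selected = responses
    else:
        selected = {q: responses[q] for q in questions}
    return mean(code_response(q, r) for q, r in selected.items())


# Hypothetical participant: raw answers to four importance in life items.
participant = {27: 5, 28: 6, 29: 6, 30: 7}
overall = mean_score(participant)            # average of 5, 6, 2 and 1
block = mean_score(participant, [29, 30])    # average of 2 and 1
```

Under this coding, a participant who answers 4 to every question scores exactly 4 whichever statements are reverse coded, which is why 4 is the central point of the scale.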
Mean scores, averaging over the 120 participants in each treatment, are shown in Table 1. The final column of the table reports the p-value of a two-tailed Mann-Whitney test for differences between the distributions of responses between the two treatments. Overall, there is no significant difference between the treatments. Only one of the four blocks (regrets) shows a significant difference, with higher scores in the goals treatment; even here, the effect size is small and there is significance only at the 10 per cent level.
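For readers unfamiliar with the Mann-Whitney test, the following sketch shows one standard way of computing the U statistic and a two-tailed p-value, using the normal approximation with a correction for ties (which matter for Likert-based scores). It is an illustration under stated assumptions, not the procedure used to produce Table 1: the function name is hypothetical, no continuity correction is applied, and statistical packages may use exact methods for small samples.

```python
import math


def mann_whitney_u(x, y):
    """Return (U for sample x, two-tailed p-value) by normal approximation."""
    n1, n2 = len(x), len(y)
    # Pool the two samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of the ranks i+1 .. j (midrank).
        midrank = (i + 1 + j) / 2
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    n = n1 + n2
    mean_u = n1 * n2 / 2
    # Variance of U under the null hypothesis, corrected for ties.
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u1, 1.0
    z = (u1 - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed tail probability
    return u1, p


# Hypothetical mean scores for a few participants in each treatment.
enjoyment = [3.6, 4.1, 4.0, 3.9, 4.4]
goals = [4.2, 4.0, 4.5, 3.8, 4.3]
u_stat, p_value = mann_whitney_u(enjoyment, goals)
```

With 120 participants per treatment, as in the study, the normal approximation would generally be considered adequate.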
[Table 1 near here]
Table 2 reports OLS regression results. In each regression equation, the dependent variable is a participant's mean score on the spontaneity/self-control scale, either for all 32 questions, or for one of the blocks of eight questions. The treatment variable is a dummy which takes the value 1 in the goals treatment. There are controls for age (in years), gender (with female as the baseline), and for responses to the diet and exercise questions in the final part of the survey, with the least (most) healthy diet and least (most) exercise coded as 1 (7).
Consistent with the Mann-Whitney tests, there are no significant treatment effects. Diet and exercise have significantly negative effects on scores for self-control wishes and generally positive effects on scores for the other blocks. We conjecture that this pattern is the result of two mechanisms that work in opposite directions. Positive attitudes to self-control are likely to be associated with following healthy regimes of diet and exercise. (Causation could go in either direction. An inclination to exercise self-control might be a cause of healthy lifestyles. Alternatively, people with healthy lifestyles might attribute those lifestyles to their powers of self-control, and this might induce self-serving approval of self-control.) However, people with healthy lifestyles may believe they already exert high levels of self-control and might approve of this without wishing to exert more self-control than they do. If such people do not identify with Q1-Q8, they may give relatively low-scoring responses. We found no significant interaction between treatment and either of these lifestyle variables.
[Table 2 near here]
Two of the regression equations show a significant gender effect: men express more favourable attitudes to self-control than do women. Following up this finding, we investigated whether our attention manipulation had different effects for male and female participants. Table 3 breaks down the data in Table 1 by gender. For male participants, spontaneity/self-control scores were significantly higher in the goals treatment than in the enjoyment treatment for the 32 questions overall, for the importance in life block and (at the 10 per cent significance level) for the regrets block. For female participants, there was no significant treatment effect overall; in the importance in life block, the goals treatment induced significantly lower scores. We do not want to make too much of this evidence as the survey was not planned with any intention of investigating gender differences.
[Table 3 near here]
As we have explained, one of the objectives of our study was to investigate whether judgements about the relative importance of spontaneity and self-control are differentially influenced by cues that direct attention to immediate enjoyment or to long-term goals. We found no firm evidence that attention-directing cues have systematic effects on such judgements. Given that this was a study with pre-registered hypothesis tests, it is appropriate to interpret its results in Popperian terms. The null hypothesis was that judgements about spontaneity and self-control are context-independent. Our design tested that hypothesis in a situation in which there was prior reason to expect it to be falsified, but in fact it was not. This result does not establish the null hypothesis as true, but (according to Popperian methodology) one can be more confident in a hypothesis, the more successfully it withstands attempts to falsify it.
As explained in Section 3.1, our memory-recall tasks were not priming manipulations expected to work through subconscious channels. They were intended to activate, and did indeed activate, conscious processes of relevant memory retrieval, reasoning and judgement. Thus, our results should not be interpreted merely as corroboration of recent scepticism about the replicability of priming experiments (e.g., Kahneman, 2012). Our experimental design was based on the premise that, if individuals' judgements about the relative importance of spontaneity and self-control were characterised by a considerable degree of context-dependence, our memory-recall tasks would exert measurable influences. It should be remembered that our aim was to investigate the effects of attention-focusing cues on individuals' cool judgements. Context-independence of such judgements may coexist with significant shifts of preference between hot and cold emotional states. Of course, there are severe constraints on the kinds of manipulations that can be used in online or laboratory experiments. As suggested by an anonymous referee, it could be that the phrasing of the judgement elicitation questions encouraged participants to think of their attitudes to self-control and spontaneity as fixed traits.

The relative importance of spontaneity and self-control
As explained in Section 5, we designed the judgement elicitation questions in the hope that spontaneity/self-control scores for the 32 individual questions could be combined into a reliable scale for measuring individuals' judgements about the importance of exercising self-control. Failing that, we hoped to define such scales for the separate categories corresponding with our four blocks of questions, or for combinations of those. In this section, we discuss how far we succeeded and what can be learned by organising individuals' responses in this way. Because we did not find differences between the two treatments, for the purposes of this analysis we will pool all the data.
The relevant data are summarised in Table 4. The 'mean score' column of this table shows the mean and standard deviation of responses to each of the 32 questions. The next column reports the item-rest correlations generated by treating all 32 responses as items in a single scale. It is immediately clear that each of the items in the self-control wishes, spontaneity wishes and (with the exception of Q24) regrets blocks is strongly and positively correlated with the other items in the overall scale. In contrast, the importance in life items have generally low (and sometimes negative) item-rest correlations. This pattern suggests that our aim of expressing all elicited judgements on a single scale was too ambitious.
[Table 4 near here]
The 'Q1-Q24' column of Table 4 uses the responses from the first 24 questions to create a single scale. Notice that this scale preserves equality between the number of items that favour spontaneity and the number that favour self-control. The consistently high and positive item-rest correlations and the Cronbach's alpha value of 0.837 give us confidence that this is a reliable scale for measuring attitudes that people express in relation to concrete activities, events and feelings that recur in everyday life. We will refer to this as a scale of everyday spontaneity/self-control.
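The two reliability statistics used here have simple textbook definitions, sketched below in Python. Cronbach's alpha for k items is k/(k − 1) × (1 − Σ item variances / variance of total scores), and an item-rest correlation is the Pearson correlation between one item and the sum of all the other items. The function names and the three-item toy data are hypothetical; the sketch does not reproduce the values in Table 4.

```python
import math
from statistics import mean, variance


def cronbach_alpha(items):
    """items: list of k lists, each holding one item's coded scores
    for the same respondents in the same order."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    ma, mb = mean(a), mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    ss_a = sum((u - ma) ** 2 for u in a)
    ss_b = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(ss_a * ss_b)


def item_rest_correlations(items):
    """Correlate each item with the sum of the remaining items."""
    result = []
    for i, item in enumerate(items):
        rest = [sum(scores) - scores[i] for scores in zip(*items)]
        result.append(pearson(item, rest))
    return result


# Hypothetical coded responses of five respondents to three items.
toy_items = [
    [2, 4, 5, 6, 7],
    [1, 3, 5, 5, 7],
    [3, 4, 4, 6, 6],
]
alpha = cronbach_alpha(toy_items)
item_rest = item_rest_correlations(toy_items)
```

An item with a low or negative item-rest correlation moves differently from the rest of the scale, which is the pattern reported above for the importance in life items when all 32 questions are treated as one scale.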
In contrast, we suggest that Q25-Q32 are picking up more 'philosophical' judgements about spontaneity or self-control, considered in general. The final column of Table 4 uses the responses to Q25-Q32 to create a spontaneity/self-control scale that is related to such judgements. We will call this the importance in life scale. The item-rest correlations are all positive, but generally lower than those for the everyday scale, and the Cronbach's alpha value of 0.527 is rather low. Our tentative conclusion is that Q25-Q32 may not be picking up a single attitude. There is no significant correlation between individual scores on the 'everyday' and 'importance in life' scales (Spearman's rho = 0.074, p = 0.255).
Because our everyday spontaneity/self-control scale combines responses to equal numbers of items favouring spontaneity and items favouring self-control, it is a reasonable starting point for analysis to interpret the mid-point of the scale as expressing an equal degree of approval of spontaneity and self-control. For example, consider a participant who recognises herself equally well in the statements 'After ordering desserts in restaurants, I often feel regret' (Q18) and 'After ordering a healthy dish, I often wish I'd chosen something tastier' (Q21). The first statement expresses regret about having chosen enjoyment when the alternative was self-control; the second expresses regret about having exercised self-control when the alternative was enjoyment. It is natural to interpret these statements as equal and opposite. Of course, many of the statements we use cannot be paired in such an obvious way, and even Q18 and Q21 are not exact opposites. This is an unavoidable property of a survey instrument that presents respondents with informal and engaging statements about everyday life. We can only say that we did our best to ensure balance between statements in favour of spontaneity and statements in favour of self-control.
Figure 1 shows the distributions of the scores for each of the four blocks and for the everyday spontaneity/self-control scale obtained by aggregating across the first three blocks. For each participant, we averaged their responses to the questions belonging to the relevant block. The histograms report the frequencies of the corresponding mean scores. The left-hand side of each diagram (score < 4) indicates that, on average, the participant leans towards spontaneity; the right-hand side (score > 4) that they lean towards self-control.
[Figure 1 near here]
Averaging over Q1-Q24, the mean response was 4.17, not far above the mid-point of the scale. Average responses for self-control wishes (4.29) and spontaneity wishes (4.35) were close to one another and slightly above the mid-point; average responses for regrets (3.88) were slightly below it. Some statements consistently elicited responses at the high end of the scale – for example, support for 'I wish I took more exercise' (Q5, average score 5.16), or lack of support for 'If I choose to walk because it's healthy, I often regret it along the way' (Q22, average reverse-coded score 5.67). Other statements consistently elicited responses at the opposite end of the scale – for example, lack of support for 'I wish I drank less sugary drinks' (Q3, average score 3.03) and 'After ordering desserts in restaurants, I often feel regret' (Q18, average score 2.71). Overall, the evidence from the 'everyday' questions suggests that our participants were about as likely to identify with statements favouring spontaneity as with statements favouring self-control.
What can we learn from the importance in life questions? The mean response to Q25-Q32 was 3.77, slightly below the mid-point of the scale. Four of these questions are particularly interesting because they elicit responses towards the two ends of the scale. There was strong support for 'It's important to make long-term plans and stick to them' (Q27, average score 4.90) and 'In life it's important to be able to resist temptation' (Q28, average score 4.79). These are direct statements of the value of self-control as a general principle or life strategy. But there was even stronger support for 'There's no harm in occasionally taking a break from an exercise routine' (Q29, average reverse-coded score 2.61) and 'Having occasional treats is an important source of happiness for me, even if they are bad for my health' (Q30, average reverse-coded score 2.13). It seems that a typical person's philosophical attitude to self-control can be expressed in proverbial (or fortune cookie) form as: There's a time for keeping resolutions, and a time for breaking them.
Figure 1 also reveals a considerable degree of heterogeneity in people's judgements. In all four blocks, we see that there are people with a strong preference for self-control, as well as people with a strong preference for spontaneity. Obviously, because each subscale is obtained by averaging (at least) eight responses, intermediate values are more common than extreme ones. This is also why the 'everyday' scale (which averages over twenty-four as opposed to eight responses) has a less spread out distribution. It is notable that some scores (e.g., spontaneity wishes) show a higher degree of individual-level heterogeneity than others (e.g., regrets). Even for our most reliable 'everyday' scale, the degree of heterogeneity is substantial. Although a sizeable majority of respondents lean towards self-control, a significant minority displays an overall preference for spontaneity. This suggests that policies based on a one-size-fits-all approach that assumes a universal desire for self-control may not do justice to the preferences of many.

Conclusion
Sunstein and Thaler's New Year's Resolution Test is an informal expression of an idea that is widely used in discussions of behavioural public policy. This idea is that, in many everyday decisions about such matters as diet and exercise, individuals experience self-control failures – they face conflicts between desires for immediate enjoyment and commitments to long-term goals, try to resist the temptation to act on those desires, but fail in the attempt. By investigating individuals' self-acknowledged goals, it is suggested, behavioural economists can identify self-control failures and design policy interventions to counter them – interventions that can be justified as enactments of individuals' own judgements.
Our findings are consistent with a more nuanced understanding of the psychology of self-control. Conflicts between desires and goals do indeed appear to be a fundamental feature of human life, but it would be a mistake to assume that people's considered judgements always favour their self-acknowledged goals over conflicting desires. Just as people may value the exercise of self-control in achieving their goals, so also they may value spontaneity in responding to their desires. Our findings suggest that spontaneity and self-control can be interpreted as opposite directions along a single scale. We have found considerable heterogeneity in individuals' judgements about the relative importance of spontaneity and self-control, with a large minority of our respondents leaning more towards spontaneity. We have not found evidence that such judgements, when elicited from individuals in cool emotional states, are context-dependent.
As we have acknowledged from the outset, our findings are those of an exploratory study. We see them as opening up an agenda of under-researched questions about attitudes to spontaneity. One obvious question concerns the robustness of our finding that those attitudes are not subject to significant attention-focusing effects. To answer that question, it would be useful to investigate wider ranges of long-term goals and attention-focusing cues. It would also be useful to investigate the possibility, tentatively suggested in Section 5, of gender differences in attitudes to spontaneity and self-control: men's attitudes may be more favourable to self-control, and more susceptible to attention-focusing cues, than women's. A further possibility is that some kinds of deviation from long-term goals are viewed as more spontaneity-affirming than others. For example, recall the contrast between our respondents' spontaneity-favouring attitudes to sugary drinks and restaurant desserts and their self-control-favouring attitudes to exercise. Breaking a health-oriented resolution by ordering a crème brûlée is perhaps a more positive way of expressing spontaneity than not taking one's daily run on a wet day.
To our knowledge, the idea that spontaneity might be valued does not appear explicitly in Sunstein and Thaler's writings, or in other work in behavioural economics that seeks to remedy self-control problems. However, as a footnote to a discussion of diet and obesity, Sunstein and Thaler (2003: 1167, note 19) discuss the related value of autonomy – that 'people are entitled to make their own choices even if they err'. Recognising that idea as a possible objection to their proposals for nudging, they say: 'We do not disagree with the view that autonomy has claims of its own, but we believe that it would be fanatical, in the settings we discuss, to treat autonomy, in the form of freedom of choice, as a kind of trump, not to be overridden on consequentialist grounds'. An advocate of libertarian paternalism might make a similar argument about spontaneity – that spontaneity has moral claims of its own, but that it would be fanatical to claim that, if the pleasures of spontaneity are taken into account, most obese people are choosing the best means of promoting their well-being. If the criterion on which public policies are justified is the maximisation of overall well-being, the truth or falsity of such claims is clearly important. But our paper is about the idea that public policies can be justified on the grounds that they help individuals to overcome self-acknowledged self-control problems. If that idea is to be used as a guiding principle, we need to be assured that those individuals want to be helped.