Behavioral interventions (BIs) – sometimes called nudges – use behavioral science to generate a change in behavior without fundamentally changing the incentive structure of the context in which decisions are made (see Osman et al., 2018; see also Oliver, 2013). BIs can be used for many ends (e.g., to conserve the environment, to get people to pay their taxes on time or to promote health and wellbeing). Examples of BIs in health and wellbeing include changing the default on pensions, so that a portion of an employee's salary is put into retirement saving unless they opt out, and changing the size of glassware in pubs to encourage people to drink less. All over the world, BIs are being used in public policy in domains including health, finance, consumer protection, education, energy, the environment, transport, taxation, telecommunications, public service delivery and the labor market (World Bank, 2015; OECD, 2017).
In some situations where BIs are used, people have a clear interest in the behavior of others. For instance, when we face a problem of social cooperation such as conserving the environment or a ‘negative externality’ (a cost incurred by a third party) such as second-hand smoke, then BIs can encourage people to behave pro-socially. But in some cases, BIs are supposed to promote the self-interest of the recipients, such as when implemented in the context of health behaviors. One prominent argument for using BIs that promote health and wellbeing is that they “make choosers better off, as judged by themselves” (Thaler & Sunstein, 2008, p. 5). This is an empirical claim that needs to be judged in the light of the evidence.
Recent work suggests that the majority of the public find BIs acceptable (Hagman et al., 2015; Jung & Mellers, 2016; Petrescu et al., 2016; Reisch & Sunstein, 2016; Reisch et al., 2016; Osman et al., 2018; Venema et al., 2018). However, people may approve of BIs because they hope that they will change other people's behavior. Studies show that people's support for BIs is higher when they are given a justification of the policy in terms of its effects on ‘people’ in general rather than when they are given a justification in terms of its effects on ‘you’ (Cornwell & Krantz, 2014). People also think that BIs will be more effective for others than for themselves and their judgments of the acceptability of BIs are predicted by how effective they anticipate BIs will be on others’ behavior, whereas the evidence is mixed as to whether acceptability judgments are predicted by how effective BIs will be at changing their own behavior (Bang et al., 2018). In these cases, people may regard the ill health of others as imposing an externality on them through the economic costs of ill health, which may increase insurance premiums or may require increased government spending, especially in countries with socialized medicine (Gold, 2018). Alternatively, it may be that people have ‘meddlesome preferences’ – preferences about how other people behave in domains where everyone should be free to make their own decisions (Sen, 1970; Blau, 1975).
Indeed, a systematic review of the acceptability of government interventions to change health-related behaviors found that support for the interventions was highest among those not engaging in the targeted behavior (Diepeveen et al., 2013).
In order to judge whether BIs make choosers better off, as judged by themselves, we need evidence that directly targets that claim. There is debate about how exactly to cash out the claim (Sugden, 2017, 2018; Sunstein, 2018), but a first step is to investigate whether people support BIs as a method to change their own behavior. Previous studies, which have asked in general terms whether BIs are acceptable, cannot distinguish whether people support them because they want to change their own behavior or because they want other people's behavior to be changed. The studies cited above (Diepeveen et al., 2013; Bang et al., 2018) suggest that support is at least partly driven by a desire to change other people's behavior. Therefore, in this study, we investigate how people evaluate BIs that are targeted at promoting their own health and wellbeing, asking them how acceptable it is for BIs to be used to change their own behavior.
We build on previous empirical work on the factors that affect the acceptability of BIs.
Previous work has consistently shown that people evaluate BIs more favorably when they are aware of the process that leads to behavioral change (Diepeveen et al., 2013; Felsen et al., 2013; Jung & Mellers, 2016; Petrescu et al., 2016; Reisch & Sunstein, 2016; Reisch et al., 2016; Sunstein, 2016; Osman et al., 2018). They prefer transparent BIs, where they can identify the mechanism that is being used to influence their behavior, as opposed to opaque BIs, where they cannot identify the mechanism of behavioral change. We define transparency in terms of ease of identification of the mechanism underpinning the BI, a definition that has also been used by other researchers (e.g., Hansen & Jespersen, 2013; Bang et al., 2018). Another way of achieving transparency is via disclosure – telling people at the point of decision that BIs are being used to change their behavior. The two sorts of transparency are related because, as well as revealing the intended effect of the BI, full disclosure can include revealing the mechanism of behavior change. One explanation of the preference for transparency is that it enables people to maintain a sense of agency over the behavior being targeted by the BI (Osman, 2014). Free choice is underpinned by a sense of agency and so, relative to opaque interventions, if people know how a behavior change is achieved, then they feel that they can more easily choose to do otherwise, thus preserving their autonomy (Lin et al., 2017; Osman et al., 2017).
Previous research shows that people trust BIs that are developed and proposed by researchers more than those that are developed and proposed by government (Osman et al., 2018). We also know that trust in government affects the acceptability of government interventions (Branson et al., 2012), and it has been suggested that negative attitudes to BIs stem from mistrust in government (Jung & Mellers, 2016). In support of this conjecture, Bang et al. (2018) found that the acceptability of BIs depends on who designs and implements them (a friend being more acceptable than a government or corporate designer) and that these differences in acceptability were explained by perceived differences in the intention of the designer. Consistent with this story, Tannenbaum et al. (2017) found that people's support for a BI depended on whether they were told that it had been enforced by a policy-maker they supported or one they opposed (the Bush administration versus the Obama administration).
Expectations about effectiveness matter
Previous research shows that the acceptability of government interventions and BIs strongly depends on their expected effectiveness (Pechey et al., 2014; Petrescu et al., 2016) and that directly manipulating the effectiveness of BIs by quantifying the resulting change in behavior affects acceptability (Sunstein, 2016; Arad & Rubinstein, 2018). Therefore, it is not surprising that giving people positive arguments – telling them that BIs are likely to be effective – affects their evaluations of the BIs. Sunstein (2016) found that, although people prefer transparent BIs to opaque ones, telling people that opaque BIs were more effective shifted their preferences toward opaque BIs by approximately 12% from baseline. However, to date, there has been no work investigating the impact of negative arguments – telling people about the possible backfire effects of the intervention – even though, outside of the laboratory, discussions about the effectiveness of BIs are more likely to be put in terms of general arguments for and against than to have precise quantifications attached.
In the present study, we compared the impacts on people's assessments of the acceptability of using BIs to change their own behavior of: the transparency of the BI (Transparent, Opaque); the designer of the BI (Researchers, Government, Advertisers); and three types of arguments regarding their efficacy (Positive, Positive + Negative, Negative). We tested the following hypotheses:
Hypothesis 1: Transparent BIs will be more acceptable than opaque BIs.
Hypothesis 2: The designer of the BI will affect the acceptability of the BI.
Hypothesis 3: The type of argument given will affect the acceptability of the BI.
Hypothesis 4: There will be an interaction effect between the designer of the BI and the type of argument given.
The rationale for Hypothesis 4 is that the more ambiguous the outcome of the BI, or the more salient the possible backfire effects, the more important the trustworthiness of the designer will be. Information about possible negative effects could cause people to doubt either the expertise or the intentions of the designer.
We expect that the transparency of a BI and its perceived likelihood of being effective are both factors that explain the acceptability of a BI and that these are mediated by a desire to change one's behavior through transparent and effective methods. Therefore, in order to discover the relative weight given to transparency and effectiveness and to test for mediation by desire to change behavior, we asked participants to rate the perceived transparency and effectiveness of each BI and their desire to have their behavior changed by that method. We also used the transparency and effectiveness ratings as manipulation checks.
In order to establish the generalizability of any results, we used five different contexts in which BIs have been implemented to promote health and wellbeing – exercise, diet, smoking, alcohol and finance – and compared acceptability across contexts. All of the interventions we showed participants were genuine interventions that have been implemented by policy-makers.
We used a mixed factorial design with one within-subject factor and three between-subject factors to give a 5 × 2 × 3 × 3 design. The within-subject manipulation was 5 different contexts in which a BI was implemented (Exercise, Diet, Smoking, Alcohol, Finance). The between-subject manipulations were: 2 Transparency of the BI (Transparent, Opaque) × 3 Designer of the BI (Researcher, Government, Advertiser) × 3 Argument about the likely effectiveness of the BI (Positive [Experiment 1a], Positive + Negative [Experiment 1b], Negative [Experiment 1c]). Experiments 1a, 1b and 1c were run serially (no one participated in more than one experiment); in each experiment, participants were randomly allocated to one of the 6 between-subject conditions.
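The structure of the design can be made concrete with a short enumeration. The following is only an illustrative sketch: the variable and function names are our own assumptions, and the code simply lists the between-subject cells implied by the description above (it is not the allocation procedure used in Qualtrics).

```python
from itertools import product

# Factor levels as described in the design; names are illustrative only.
TRANSPARENCY = ["Transparent", "Opaque"]
DESIGNER = ["Researcher", "Government", "Advertiser"]
ARGUMENT_BY_EXPERIMENT = {
    "1a": "Positive",
    "1b": "Positive + Negative",
    "1c": "Negative",
}
CONTEXTS = ["Exercise", "Diet", "Smoking", "Alcohol", "Finance"]


def cells_for_experiment(experiment):
    """Between-subject cells available within one experiment.

    Argument type is fixed within an experiment, so only Transparency and
    Designer vary, giving 2 x 3 = 6 cells.
    """
    argument = ARGUMENT_BY_EXPERIMENT[experiment]
    return [(t, d, argument) for t, d in product(TRANSPARENCY, DESIGNER)]


all_cells = [
    cell
    for experiment in ARGUMENT_BY_EXPERIMENT
    for cell in cells_for_experiment(experiment)
]

print(len(cells_for_experiment("1a")))  # 6
print(len(all_cells))                   # 18
# Every participant rates all five contexts within their cell.
print(len(CONTEXTS))                    # 5
```

Under this reading, the 18 between-subject cells arise from three serially run experiments of six cells each, with Context crossed within participants.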
For each context, there were four probative questions regarding BIs. After responding to the probes in all five contexts, participants were asked five demographic questions – about their age, sex, education, political affiliation and religion – and whether they were a smoker. At the end of the experiment, they were also asked some questions about their attitudes to BIs, including to indicate which contexts should not involve psychological methods designed to change behavior.
All experiments were presented via Qualtrics, which is an online platform for running experiments, and launched via Prolific Academic, a crowdsourcing system for participant recruitment of those who have university email addresses, including large pools in the USA and UK, both of which we used. All participants were financially compensated for their time (90 cents), calculated according to Prolific Academic's rates.
The Queen Mary University of London college ethics board granted ethical approval for the experiments under the project titled ‘Ethical concerns around nudges’ (QMERC2014/54).
Each experiment included US (total n = 872) and UK samples (total n = 843) (see Table 1). Although Experiments 1a, 1b and 1c were run serially, they drew from the same population and there were no differences in demographics (see Supplementary Appendix S1, available online, for details), so we have combined them for the analysis. Participants who took much less or more time to complete the task than the allocated time (less than 8 or more than 30 minutes) were excluded. Four further participants failed an attention check question and were removed from the analysis.
After consenting to take part in the experiment, all participants were told that they were going to be asked questions about psychological methods that have been used to bring about behavior change and that “All of these methods are designed to help guide people to make the best decision for their own health and wellbeing.” For full instructions, see Supplementary Appendix S2.
Manipulation of designer
Participants were told the following:
The [Top Advertising Company, Government, Top Researchers in Laboratories] in this country is/are using psychological research to help develop a set of simple methods that adjust the way information is presented, so that it can help people to make better decisions. The reason for using psychological methods is to help improve people's behavior, because in many day-to-day contexts people may not make a decision that is best for their own health, wellbeing and their happiness.
BIs and manipulation of transparency
Participants were then presented with a description of a BI. For each context, there were two BIs – one that was transparent and one that was opaque – all based on genuine interventions that have been implemented. Participants were randomly assigned to receive descriptions of either five transparent BIs or five opaque BIs. (The ten BIs are given in Table 2.) For each context, first, they were told what the context was in which the method would be used (e.g., ‘Smoking’), and what the Recommended Psychological Method was (e.g., “Design cigarette packaging so that it incorporates graphic pictures of damaged lungs and warnings such as ‘Smoking seriously harms you and others around you’, ‘Smoking harms your unborn baby’”). The order of presentation of the five contexts was randomized for each participant.
Arguments and manipulation of effectiveness
In Experiments 1a and 1b, participants were then presented with:
Argument for method to work: By highlighting the negative physical and moral issues concerning smoking, the negative experiences will become more obviously associated with smoking, and this will encourage smokers to reduce or even stop smoking.
In Experiments 1b and 1c, participants were presented with:
Argument for method NOT to work: By highlighting the negative physical and moral issues concerning smoking, the negative experiences will become so obviously associated with smoking that smokers will feel more defensive of their smoking habit, as a result, smokers will end up smoking more, meaning that the method will lead to increases in smoking.
A full list of the arguments for each context can be found in Table 3.
Explanation of transparency
Before the first probe, all participants were provided with the following definition of transparent and opaque BIs:
There are two types of psychological methods: transparent and non-transparent. A transparent psychological method works in such a way that anyone can easily identify the actual psychological method used to change their behavior, as well as easily identify how their behavior is changed by it. A non-transparent psychological method works in such a way that no one can identify the actual psychological method used to change their behavior, and no one can identify how their behavior is changed by it.
For each of the five BIs, all participants were required to respond to four questions concerning:
(1) Ease of identification
To what extent is it easy for you to identify HOW your behavior is going to be changed by the psychological method? (Scale 0 = I cannot easily identify how my behavior is changed by the psychological method to 100 = I can easily identify how my behavior is changed by the psychological method)
(2) Desire to change behavior
To what extent do you want to change your behavior through the psychological method in this particular situation? (Scale 1 = Not at all likely to 9 = Very likely)
(3) Effectiveness
To what extent do you think the psychological method used in this particular situation would positively change YOUR behavior? (Scale 1 = Not at all likely to 9 = Very likely)
(4) Acceptability
To what extent do you think it is acceptable to use the psychological method described in this context to change your behavior? (Scale 1 = Unacceptable to 9 = Acceptable)
Once participants had responded to all four questions for each of the five scenarios, they were presented with five demographic questions about their age, sex, education level, political affiliation and religion, and asked whether they smoked or not (or preferred not to say).
In addition, we asked several exploratory questions, most of which we did not analyze since they do not bear directly on our hypotheses. Participants were asked: about the extent to which each BI would lead to positive changes in behavior in the population; whether they think that there are ethical issues concerning each BI and, if so, what they are; and some questions about how they value their health and wellbeing. Full instructions are available in Supplementary Appendix S2.
We have analyzed one of our exploratory questions because it was relevant given our results. At the end of the experiment, participants were asked to indicate which contexts should not involve psychological methods designed to change behavior, with the following possible responses (they could choose as many or as few as they wanted, so a participant may have chosen multiple options):
◻ Food and nutrition decisions concerning food, drink and nutritional intake
◻ Smoking decisions to quit smoking
◻ Alcohol decisions to reduce drinking
◻ Exercise decisions to increase levels of physical activity
◻ Financial decisions concerning investment
◻ Financial decisions concerning savings
◻ NONE of the five contexts should have psychological methods used to influence decision-making behavior
◻ I cannot decide
◻ All of the contexts should have psychological methods implemented to influence decision-making behavior
We analyzed a second of our exploratory questions at the request of an anonymous reviewer. For each of the five BIs, we asked:
Based on the psychological method used in this particular situation, to what extent do you think it would lead to positive changes in behavior IN THE POPULATION? (Scale 1 = Not at all likely to 9 = Very likely)
Manipulation check: Ease of Identification of behavior change method ratings
In order to test that our manipulation of transparency worked, we ran a four-way mixed multivariate analysis of variance (ANOVA) with the Ease of Identification ratings in the five contexts as the dependent variables, Context as the within-subject variable and the elements of the factorial design supplying the between-subject independent variables (including interaction effects).
Tests of between-subject effects showed that our manipulation affected Ease of Identification across all contexts combined. As we expected, given that we manipulated transparency, there was a small to medium main effect of Transparency on Ease of Identification, F(1, 1697) = 163.70, p < 0.001, partial η2 = 0.088, with the Ease of Identification of Transparent BIs (M = 69.21, SE = 0.57) being higher than Opaque ones (M = 58.85, SE = 0.57). There were no other statistically significant effects. The full between-subjects model is given in Supplementary Appendix S3.
Multivariate tests showed that the difference in Ease of Identification between Opaque and Transparent BIs was also found in most of the individual contexts. There was a small to medium interaction effect between Ease of Identification and Transparency, F(4, 1694) = 38.31, p < 0.001, Wilks’ Λ = 0.917, partial η2 = 0.083. Post hoc pairwise comparisons revealed that the Transparent BIs had higher Ease of Identification ratings than the Opaque ones in all contexts except Diet (all p < 0.001, except Diet, which was p = 0.144; see Figure 1). The full set of multivariate tests is given in Supplementary Appendix S3, as are the results of the less powerful within-subject tests, which show the same pattern of effects.
Manipulation check: Effectiveness ratings
In order to test whether giving different arguments affected the perceived effectiveness of the BIs, we ran a four-way mixed multivariate ANOVA with the Effectiveness ratings in the five contexts as the dependent variables, Context as the within-subject variable and the elements of the factorial design supplying the between-subject independent variables (including interaction effects).
The between-subjects tests showed that our manipulations failed to have the desired effect on Effectiveness across all contexts combined. There was only a very small main effect of Argument, F(2, 1697) = 61.66, p = 0.006, partial η2 = 0.006, and a very small effect of Designer, F(2, 1697) = 3.66, p = 0.026, partial η2 = 0.004. Surprisingly, the largest main effect was the small main effect of Transparency, F(1, 1697) = 86.25, p < 0.001, partial η2 = 0.048, on Effectiveness ratings. There were no significant interaction effects. See Supplementary Appendix S3 for the full model.
Multivariate tests showed that the effect of Transparency on Effectiveness was also seen in the individual contexts. There was a medium-sized interaction effect between Effectiveness and Transparency, F(4, 1694) = 38.98, p < 0.001, Wilks’ Λ = 0.916, partial η2 = 0.084. Post hoc pairwise comparisons revealed that the Transparent BIs were rated as more likely to be effective than the Opaque ones in all contexts except Alcohol (all p < 0.001, except Diet, which was p = 0.02, and Alcohol, which was p = 0.267; see Figure 2). The full set of multivariate tests are given in Supplementary Appendix S3, as are the results of the less powerful within-subject tests, which showed the same pattern of effects.
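As a reading aid, the multivariate statistics reported above can be cross-checked against one another. The sketch below assumes the standard relations for the case where the hypothesis has a single degree of freedom (s = 1, as for a two-level factor such as Transparency), in which the F statistic is an exact function of Wilks' Λ and partial η2 = 1 − Λ. The function names are our own, and this is a consistency check under that assumption rather than a reproduction of the original analysis output.

```python
def wilks_from_f(f_value, df_hypothesis, df_error):
    """Wilks' lambda implied by an exact multivariate F (valid when s = 1)."""
    return 1.0 / (1.0 + f_value * df_hypothesis / df_error)


def partial_eta_sq_from_wilks(wilks_lambda, s=1):
    """Partial eta squared from Wilks' lambda: 1 - lambda ** (1 / s)."""
    return 1.0 - wilks_lambda ** (1.0 / s)


# Effectiveness x Transparency interaction: F(4, 1694) = 38.98
implied_lambda = wilks_from_f(38.98, 4, 1694)
print(round(implied_lambda, 3))                      # 0.916
print(round(partial_eta_sq_from_wilks(0.916), 3))    # 0.084
```

Both values agree with the Wilks' Λ = 0.916 and partial η2 = 0.084 reported for this interaction.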
Hypothesis testing: Acceptability ratings
We ran a four-way mixed multivariate ANOVA with the participants’ Acceptability ratings in the five contexts as the dependent variables, Context as the within-subject variable and the elements of the factorial design supplying the between-subject independent variables (including interaction effects). We used between-subject tests to examine our hypotheses across the five contexts combined and multivariate tests to investigate whether the results held in each context considered individually.
There was a medium-sized main effect of Transparency, F(1, 1697) = 248.08, p < 0.001, partial η2 = 0.128, with the acceptability of Transparent BIs (M = 6.96, SE = 0.05) being higher than Opaque BIs (M = 5.86, SE = 0.05). This supports Hypothesis 1 that Transparent BIs will be more acceptable than Opaque BIs.
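For readers who wish to verify univariate effect sizes, partial η2 can be recovered from a reported F statistic and its degrees of freedom using the standard identity partial η2 = (F × df_effect) / (F × df_effect + df_error). The helper below is a minimal sketch with a hypothetical function name, applied to the Transparency effect on Acceptability.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F statistic and its degrees of freedom."""
    explained = f_value * df_effect
    return explained / (explained + df_error)


# Main effect of Transparency on Acceptability: F(1, 1697) = 248.08
print(round(partial_eta_squared(248.08, 1, 1697), 3))  # 0.128
```

The result matches the reported partial η2 of 0.128.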
There was a significant but negligible main effect of Designer, F(2, 1697) = 3.60, p = 0.028, partial η2 = 0.004: Researchers M = 6.53, SE = 0.06; Advertisers M = 6.38, SE = 0.06; Government M = 6.31, SE = 0.06. This technically supports Hypothesis 2 that the designer of the BI will affect its acceptability, but the effect size is not meaningful.
There was a small main effect of Argument, F(2, 1697) = 10.52, p < 0.001, partial η2 = 0.012. Post hoc pairwise comparisons revealed that the effect of Argument was due to the mean acceptability being higher for Positive (M = 6.64, SE = 0.064) than for Positive + Negative (M = 6.33, SE = 0.058) and Negative (M = 6.26, SE = 0.058; both p < 0.001 and well under the Bonferroni-adjusted significance level of p = 0.017), but there was no significant difference between acceptability for Positive + Negative and Negative (p = 0.37). This supports Hypothesis 3 that the arguments will affect the acceptability of the BI.
There were no significant interaction effects, so Hypothesis 4 – that there will be an interaction effect between the designer of the BI and the type of argument – was not supported. See Supplementary Appendix S3 for the full model.
Multivariate tests, exploring our within-subject variable, showed the differential acceptability of BIs in the five contexts. There was a large main effect of Context on Acceptability, F(4, 1694) = 423.40, p < 0.001, Wilks’ Λ = 0.495, partial η2 = 0.505, suggesting that there were differences in Acceptability between at least one pair of contexts. Post hoc pairwise comparisons revealed that the means of the Acceptability ratings in each context differed (all p < 0.001, lower than the Bonferroni-adjusted significance level of p = 0.005), except for the means of Smoking and Alcohol (p = 0.71). The BIs were most acceptable in the context of Exercise (M = 7.28, SE = 0.048), followed by Diet (M = 6.97, SE = 0.046), Smoking (M = 6.55, SE = 0.051), Alcohol (M = 6.53, SE = 0.052) and Finance (M = 4.72, SE = 0.054).
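The Bonferroni-adjusted significance levels quoted for the post hoc comparisons follow from dividing the conventional α = 0.05 by the number of pairwise comparisons. The sketch below, using hypothetical helper names, shows the arithmetic for the three Argument conditions and the five contexts.

```python
from math import comb


def bonferroni_alpha(n_comparisons, alpha=0.05):
    """Per-comparison significance level after Bonferroni adjustment."""
    return alpha / n_comparisons


# All pairwise comparisons among k conditions: k choose 2.
argument_pairs = comb(3, 2)  # three Argument conditions -> 3 comparisons
context_pairs = comb(5, 2)   # five contexts -> 10 comparisons

print(round(bonferroni_alpha(argument_pairs), 3))  # 0.017
print(round(bonferroni_alpha(context_pairs), 3))   # 0.005
```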
There was a large interaction effect between Acceptability and Transparency, F(4, 1694) = 122.94, p < 0.001, Wilks’ Λ = 0.775, partial η2 = 0.225. Post hoc pairwise comparisons revealed that the Transparent BIs were rated as more acceptable than the Opaque BIs in all contexts except Exercise (all p < 0.001, except Exercise, which, at p = 0.027, was more than the Bonferroni-adjusted significance level of p = 0.01; see Figure 3).
There was also a small interaction effect between Acceptability and Argument, F(8, 3388) = 4.44, p < 0.001, Wilks’ Λ = 0.979, partial η2 = 0.010, and a very small three-way interaction between Acceptability, Argument and Transparency, F(8, 3388) = 2.91, p = 0.003, Wilks’ Λ = 0.986, partial η2 = 0.007 (see Figure 4). Post hoc tests and means can be found in Supplementary Appendix S3.
There were no significant effects of the Designer of the BI (see Figure 5).
The results of within-subject effects tests confirmed the results of the multivariate tests. The full set of multivariate tests and within-subject tests can be found in Supplementary Appendix S3.
Predictors of Acceptability ratings
We ran regressions to discover the best predictors of Acceptability judgments in each context using standardized coefficients in order to be able to meaningfully compare effect sizes. For the Smoking context, we also ran a set of regressions that were limited to the participants who said they were smokers, since smokers are a small proportion of the population, less than 20% in the UK (ONS, 2018). The models are given in Table 4. For each context, Model 1 had Ease of Identification ratings as the sole predictor, Model 2 had Effectiveness as the sole predictor, Model 3 had Desire to Change Behavior as the sole predictor, Model 4 had both Ease of Identification and Effectiveness as predictors and Model 5 had all three of Ease of Identification, Effectiveness and Desire to Change Behavior as predictors.
There are clear patterns that held across the five contexts. All three ratings were significant predictors of Acceptability ratings when entered into the model separately (Models 1–3, Table 4), except that Ease of Identification was not a significant predictor amongst smokers for the acceptability of BIs for Smoking in any of the models.
Comparing the predictive power of Ease of Identification and Effectiveness by entering them both in Model 4, we can see that, across contexts, Effectiveness had a bigger effect on Acceptability than Ease of Identification (except for Exercise, where they were approximately equal with β = 0.230 for Ease of Identification and β = 0.228 for Effectiveness, both p < 0.001). The largest differences were found in the Finance context, where the coefficient on Effectiveness was more than four times larger than that on Ease of Identification (β = 0.140 for Ease of Identification, β = 0.597 for Effectiveness, both p < 0.001), and amongst smokers in the Smoking context (β = 0.003, p = 0.963 for Ease of Identification, β = 0.463, p < 0.001 for Effectiveness).
However, when we added Desire to Change Behavior into the models (Model 5), the coefficient on Effectiveness clearly decreased. In the case of Exercise, Effectiveness even became non-significant, β = 0.037, p = 0.295. This suggests that Desire to Change Behavior wholly mediates Effectiveness for Exercise and partially mediates Effectiveness in the other four contexts. We confirmed this by testing the remaining step for mediation (Baron & Kenny, 1986): regressing Desire to Change Behavior on Effectiveness. The models in Table 5 show that Effectiveness was a significant predictor of Desire to Change Behavior in all five contexts, confirming that Desire to Change Behavior at least partially mediated the effect of Effectiveness on Acceptability.
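The logic of the mediation steps can be illustrated with a small simulation. The sketch below is an assumption-laden illustration only: the data are synthetic (generated so that the mediator carries the whole effect), the variable and function names are ours, and the standardized coefficients are computed from correlations rather than with the statistical software used for the reported models.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation, which equals the standardized simple-regression beta."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def standardized_betas(y, x1, x2):
    """Standardized coefficients for y regressed on x1 and x2 together."""
    r_y1, r_y2, r_12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    denom = 1.0 - r_12 ** 2
    return (r_y1 - r_y2 * r_12) / denom, (r_y2 - r_y1 * r_12) / denom


# SYNTHETIC data, not the study's: effectiveness -> desire -> acceptability.
rng = random.Random(0)
n = 2000
effectiveness = [rng.gauss(0, 1) for _ in range(n)]
desire = [0.7 * e + rng.gauss(0, 1) for e in effectiveness]
acceptability = [0.6 * d + rng.gauss(0, 1) for d in desire]

# Step 1: predictor -> outcome (cf. Model 2).
total_effect = pearson(acceptability, effectiveness)
# Step 2: predictor -> mediator (cf. Table 5).
a_path = pearson(desire, effectiveness)
# Step 3: predictor and mediator -> outcome (cf. Model 5, without Ease of Identification).
direct_effect, b_path = standardized_betas(acceptability, effectiveness, desire)

print(f"total effect of effectiveness:  {total_effect:.3f}")
print(f"effectiveness -> desire:        {a_path:.3f}")
print(f"direct effect with mediator in: {direct_effect:.3f}")
```

In this simulated case the coefficient on the predictor shrinks towards zero once the mediator is entered, which is the pattern interpreted above as whole mediation; a coefficient that shrinks but remains clearly non-zero would correspond to partial mediation.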
Multiple-choice question on the use of BIs in different contexts
BIs were clearly less acceptable in financial contexts, with 63.0% of participants saying that they should not be used for financial decisions involving investment and 53.4% saying that they should not be used for financial decisions involving savings. The next largest group was the 13.8% who could not decide, and all the other answers were chosen by less than 10% (see Figure 6).
Perceived effectiveness of BIs on self compared to effectiveness of BIs on the population
The BI was judged likely to be more effective on the population's behavior than on the participant's own behavior for four out of our five BIs, as shown by paired t-tests: Diet, population behavior M = 5.64, SD = 1.8, own behavior M = 5.47, SD = 2.23, t(1714) = 3.9, p < 0.001; Smoking, population behavior M = 5.24, SD = 1.93, own behavior M = 4.79, SD = 2.44, t(1714) = 8.7, p < 0.001; Alcohol, population behavior M = 4.27, SD = 1.95, own behavior M = 4.01, SD = 2.26, t(1714) = 5.8, p < 0.001; Finance, population behavior M = 4.20, SD = 2.07, own behavior M = 4.79, SD = 2.32, t(1714) = 10.4, p < 0.001. For Exercise, the difference was in the other direction, with participants judging the BI as less likely to affect population behavior (M = 4.78, SD = 2.00) than their own behavior (M = 5.02, SD = 2.38), t(1714) = –6.2, p < 0.001. We investigated this difference further by running multivariate ANOVAs for each domain with the Effectiveness on own behavior and on population behavior as the dependent variables (so own–population behavior was the within-subject variable) and Transparency as the between-subject independent variable. For all five areas, there was a statistically significant interaction effect between Transparency and own–population behavior (all p < 0.001). For Diet, Smoking, Alcohol and Finance, there was an increased discrepancy in effectiveness on own behavior versus on population behavior in opaque BIs compared to the transparent BIs. However, for Exercise, the difference was the other way around: participants rated the transparent BI as more likely to be effective on their own behavior than on population behavior, and this discrepancy decreased in the opaque BI (see Table 6).
Note a to Table 6: These are the correct mean differences, even though they are not the differences between the numbers in the preceding columns; the discrepancies are caused by rounding to two decimal places.
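For readers who want to see how a paired comparison of this kind is computed, the following is a minimal sketch, not the authors' analysis code. It uses synthetic ratings rather than the study's data; the function name `paired_t`, the 0 to 10 rating range and the way the synthetic ratings are generated are all assumptions made for illustration. The only detail taken from the text is that the reported t(1714) implies 1715 paired observations, since a paired t-test has n − 1 degrees of freedom.

```python
import math
import random
import statistics


def paired_t(own, population):
    """Return (t, df) for a paired t-test on population-minus-own differences."""
    if len(own) != len(population):
        raise ValueError("paired samples must have equal length")
    diffs = [p - o for o, p in zip(own, population)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Synthetic ratings for illustration only; NOT the study's data.
# The 0-10 range is an assumption, not taken from the article.
rng = random.Random(0)
n_participants = 1715  # t(1714) in the text implies 1715 paired observations
own = [rng.randint(0, 10) for _ in range(n_participants)]
population = [min(10, max(0, o + rng.choice([-1, 0, 0, 1, 1]))) for o in own]

t_stat, df = paired_t(own, population)
print(f"t({df}) = {t_stat:.2f}")
```

A positive t here means the synthetic "population" ratings are higher on average than the "own" ratings, matching the sign convention used for Diet, Smoking and Alcohol in the text.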
General discussion and conclusions
We found that transparent BIs were more acceptable than opaque BIs (Hypothesis 1), and this result held across all contexts taken individually, except Exercise. The type of argument given affected the assessment of the BI, with BIs presented alongside positive arguments rated as more acceptable than those presented alongside negative or a mix of positive and negative arguments (Hypothesis 3), but this was a small effect, and the only two contexts that showed this effect individually were Alcohol and Finance. BIs that are implemented by researchers were judged as being slightly more acceptable than BIs implemented by governments (Hypothesis 2), but this was such a small effect that it is not meaningful. There was no interaction effect between the designer of the BI and the type of argument given, contra Hypothesis 4. On average, all of the BIs were considered acceptable for changing participants’ own behavior (with mean acceptability ratings above the mid-point of the scale), except for the opaque BI in the Finance context; there was differential acceptability of BIs across contexts, with Finance clearly least acceptable.
As well as finding transparent BIs more acceptable than opaque BIs, our participants regarded them as more likely to result in positive behavior change. Furthermore, the effectiveness of the BIs was at least as influential a predictor of acceptability ratings as the ease of identification of the behavior change mechanism across the five contexts (and considerably more influential in some, especially Finance and, interestingly, amongst smokers in the context of smoking cessation). There was a direct effect of ease of identification on acceptability – except, notably, for smokers when asked about BIs that discourage smoking – which we had expected given H1. This is consistent with arguments that people care about having a sense of agency over their actions (Osman, Reference Osman2014) and past findings that people view opaque BIs as more autonomy-threatening than transparent ones (Jung & Mellers, Reference Jung and Mellers2016). However, the likelihood that the BI would result in positive behavior change had more predictive power than Ease of Identification in all contexts except Exercise. Bang et al. (Reference Bang, Shu and Weber2018) found mixed results on this point, with their study 1 finding no relationship between effectiveness for self and acceptability, but their study 2 finding that the expected effectiveness of a change in choice architecture on one's own behavior predicted the acceptability of the change. Our results are consistent with those of their study 2. Our finding that people predict that the BI will be more likely to be effective on the population as a whole than on their own behavior is also consistent with the Bang et al. (Reference Bang, Shu and Weber2018) finding that people think BIs will be more effective for others than for themselves.
It is not surprising that people care about both transparency and effectiveness – this is an obvious prediction that is also consistent with previous results (e.g., Sunstein, Reference Sunstein2016; Arad & Rubinstein, Reference Arad and Rubinstein2018). However, the relative importance of effectiveness is at odds with the theoretical focus on the acceptability of transparency. For example, a parliamentary report in the UK identified the extent to which a BI is covert as one of two criteria that should bear on its acceptability, the other being the extent to which the BI is popular with the public (House of Lords, Science and Technology Select Committee, 2011). Our results are consistent with another survey study whose authors also drew conclusions from average ratings: Petrescu et al. (Reference Petrescu, Hollands, Couturier, Ng and Marteau2016) tested the hypothesis that stating that interventions work via non-conscious processes decreases their acceptability. They found no evidence to support the hypothesis, but they did find that the effectiveness of the BI was a predictor of acceptability (Petrescu et al., Reference Petrescu, Hollands, Couturier, Ng and Marteau2016). The authors of a qualitative study also reported that interviewees had very limited concerns regarding the manipulative aspects of BIs (Junghans et al., Reference Junghans, Cheung and De Ridder2015). It is possible that transparency is a strong concern for a minority of people; for instance, Arad and Rubinstein (Reference Arad and Rubinstein2018) found that a minority of their subjects reported an opposition to BIs, and this was driven by concerns about manipulation and the fear of a ‘slippery slope’ to non-consensual interventions.
It is also possible that the use of survey methods decreases the impact of transparency, and if we had conducted a vignette study, then transparency would have been a more influential predictor of acceptability, since getting participants to imagine being in the situation would have simulated the feeling of being manipulated.
Our finding that people rated transparent BIs as more effective than opaque BIs is also surprising. In academic debates on the acceptability of BIs, it has been assumed that transparency and effectiveness pull in different directions (Bovens, Reference Bovens, Grüne-Yanoff and Hansson2009; House of Lords, Science and Technology Select Committee, 2011). Bovens (Reference Bovens, Grüne-Yanoff and Hansson2009, pp. 209, 217) says that these techniques “work best in the dark.” However, the participants in our sample did not seem to agree with that. (The same holds for the participants of Jung & Mellers, Reference Jung and Mellers2016, who viewed transparent, System 2 BIs as more effective for changing behavior.) It could be that being able to identify a mechanism made it seem more likely to our participants that the BI would be effective, or there could be a halo effect whereby a more acceptable BI is generally judged to have more of other desirable properties as well. However, it seems that our participants’ folk psychology is right, since there are now several studies showing that disclosure does not affect effectiveness, most of which concerned defaults (Loewenstein et al., Reference Loewenstein, Bryce, Hagmann and Rajpal2015; Steffel et al., Reference Steffel, Williams and Pogacar2016; Bruns et al., Reference Bruns, Kantorowicz-Reznichenko, Klement, Jonsson and Rahali2018), but one of which concerned the placement of food items in a snack shop (Kroese et al., Reference Kroese, Marchiori and de Ridder2015).
In our study, the effect of effectiveness on acceptability was partially mediated by the desire to change behavior through the BI. In other words, when people believed that a BI would be effective, they wanted to use it to change their behavior. This supports the contention that people do want to achieve positive behavior change and that they support BIs that will help them to do that; there is a sense in which people find BIs acceptable because the interventions will make them better off as judged by themselves.
We found a general lack of support for Hypotheses 2–4. There was only a very small effect of Argument, a negligible effect of Designer and there was no evidence of the predicted interaction effect between Argument and Designer. The low impact of Argument (and lack of interaction effect) is less surprising when we consider that the Argument manipulation did not have much impact on effectiveness ratings. The lack of substantial impact of the designer of the BI is more surprising given that scientists are more trusted than governments: an Ipsos Mori (2018) survey found that 85% of British adults trust scientists to tell the truth compared to 19% for politicians and 16% for advertising executives, and a previous study showed that people trust BIs that are developed and proposed by scientists more than those that are developed and proposed by governments (Osman et al., Reference Osman, Fenton, Pilditch, Lagnado and Neil2018). Osman et al. (Reference Osman, Fenton, Pilditch, Lagnado and Neil2018) explained their result with reference to research on ‘source credibility’ in the psychology of communication, which is the idea that people are more receptive when there is a good fit between the area of expertise of the communicator (in this case, the proposer of the BI) and the topic of communication (in this case, the BI being proposed); for a review of the literature on source credibility, see Pornpitakpan (Reference Pornpitakpan2004). It may be that our subjects did not see any differential in expertise, since we stressed that the advertisers and researchers were ‘top’ of their field, and there were only negligible effects of the designer on ratings of whether the BI would positively change behavior. 
The idea that we would see an interaction effect between Designer and Argument was predicated on creating ambiguity, but as well as being unsuccessful at creating ambiguity about the likely effectiveness of the BI, we probably did not create ambiguity around the intention of the designer, since we had also stated that the aim of the BI was to promote positive behavior change. If participants thought that all three designers were equally effective and well-intentioned, then there would be no reason for there to be an effect of Designer on their judgments.
We found that using BIs in Finance was less acceptable than using BIs in other contexts. The mean acceptability of the health-related BIs ranged from 6.53 to 7.28, while the mean acceptability of Finance-related BIs was only 4.72. Our results in the four health-related contexts are consistent with evidence that people approve of BIs for health behavior (Junghans et al., Reference Junghans, Cheung and De Ridder2015; Reisch et al., Reference Reisch, Sunstein and Gwozdz2017). We can only speculate about why our Finance-related BIs were less acceptable, since we had only a single pair of transparent and opaque BIs in each context, and the BIs were not matched (matching was not possible given that we used BIs that had actually been implemented). The opaque Finance-related BI was also the only BI that used a default. However, we do not think that it was the default alone that caused the low ratings, since there are field experiments whose results show that people approve of having their own behavior changed by BIs using defaults. For instance, after being exposed to a BI that presented the vegetarian meal as a default when registering for a conference – increasing the number choosing the vegetarian meal from 13% to 89% – 90% of those exposed said that they approved of changing the default (Hansen et al., Reference Hansen, Malthesen and Schilling2019). In addition, after an intervention that changed the default positions of sit–stand desks in a workplace to the standing position, increasing the rate of standing from 1.8% to 13.1%, 56.5% of employees said that it was acceptable to be unconsciously influenced in this way (Venema et al., Reference Venema, Kroese and De Ridder2018).Footnote 4
In our multiple-choice follow-up question, we found that Finance was a clear outlier, with the majority of participants saying that psychological methods should not be used in that context. We suspect that there are features that differentiate health from finance, which made the finance-related BIs less acceptable. In health, everyone can agree that, for example, high-sugar and high-fat products are unhealthy. However, in finance, the best product for someone depends on their attitudes to risk. Even though a traffic-light rating marks the riskiest products as red, those may be the most appropriate products for some people; likewise, a default product may not be the best choice for everyone.Footnote 5 Therefore, people may be more skeptical that BIs in finance will actually promote their wellbeing. In our study, people considered that BIs were least likely to have a positive effect in Finance (in fact, with a mean of 3.84, the rating of BIs in Finance was on the non-effective side of the scale), and the differential predictive power of Effectiveness and Ease of Identification on Acceptability ratings was particularly striking for Finance. In the regression models, the coefficient on Effectiveness was higher for Finance than for any other context. So our participants had a high level of concern regarding the Effectiveness of the Finance BI, but did not think that it would have positive effects.
Another reason why participants may have been dubious about the effectiveness of the Finance BI is that they might have been worried about whether the default would benefit them if it was influenced by industry, as the bank might wish to default them into an option that would be profitable for the bank. Since the financial crisis, attitudes to the financial services industry have become more negative (Bennett & Kottasz, Reference Bennett and Kottasz2012), and some authors have concluded that there is now a crisis of trust in that sector (Bachmann et al., Reference Bachmann, Gillespie and Kramer2011; Sapienza & Zingales, Reference Sapienza and Zingales2012). This lack of trust may also help explain why people tend to think that finance-related BIs are less likely to lead to positive changes than health-related interventions, and why they find finance-related BIs less acceptable. This is ironic because financial literacy around retirement saving and pension plans is low (Lusardi & Mitchell, Reference Lusardi and Mitchell2011a, Reference Lusardi and Mitchell2011b), suggesting that there is scope for using BIs to improve outcomes in this area.
To view supplementary material for this article, please visit https://doi.org/10.1017/bpp.2020.6
This work was supported by Natalie Gold's funding from the European Research Council, under the European Union's Seventh Framework Programme (FP/2007-2013)/ERC Grant Agreement no. 283849. This research was also funded by Queen Mary University of London Life Sciences Initiative Studentship (LSIPGRS).