1. Introduction
Incentives matter. This imperative is a core tenet of economics: insufficient or inadequate incentives lead to deviations from the behavior predicted by economic and behavioral models. Economic experiments often involve multiple decisions over the same task, or they incorporate multiple tasks within the same study. Correspondingly, researchers must weigh the potential tradeoffs of different payment mechanisms, the size of incentives, potential payoff externalities, and budget constraints. Paying for every decision increases costs and may induce portfolio and wealth effects. While paying for only one randomly selected choice mitigates these effects, it may dilute incentives as the number of choices increases (Beattie & Loomes, 1997; Charness et al., 2016). To complicate matters further, random incentive schemes may fail to be incentive-compatible, exhibit menu dependence, and induce risk preferences even in purely deterministic settings. This shortcoming has led some researchers to argue for collecting preferences over only a single choice (e.g., Cox et al., 2015; Harrison & Swarthout, 2014), despite the restrictive nature of this approach. Overall, the first-best incentive scheme is rarely transparent, making it difficult to develop general guidelines.
The lack of empirical interest, even from experimental economists (Azrieli et al., 2018), suggests that incentive compatibility concerns may be overstated. More troubling, perhaps, is the fact that the incentive compatibility of different mechanisms cannot be established without particularly strong assumptions on preferences. For example, most theoretical work focuses on menu-independent binary preferences. Paradoxically, such preferences may not exist if agents have non-consequentialist preferences, e.g., non-expected utility preferences (Machina, 1989). Moreover, incentive compatibility has little explanatory power for plausible mistakes or heuristics. How, then, can one elicit true preferences and discriminate among existing incentive mechanisms?
Our study employs the simplest application of induced values (Smith, 1976), which provides a straightforward empirical testing approach: the monetary value of money is known. Consequently, across all of our experimental treatments, the objective preferences over monetary amounts are clearly defined in terms of their monetary equivalents. This allows us to focus on the proper empirical incentive scheme, that is, on the effectiveness of different incentive schemes in recovering the correct (induced) individual preferences. In a nutshell, the known monetary value of $\$2$ is $\$2$, as used by Cason and Plott (2014).
Across these valuation tasks, we consider, broadly, three dimensions of incentive mechanisms: 1) the size of the prize, 2) the chance with which payment occurs at all, i.e., the “incentive scheme,” and 3) the chance that each choice counts toward total earnings, i.e., the “payment mechanism.” In the valuation tasks, we also vary the range of values people can assign to a certain monetary amount and whether the uncertainty determining earnings is strategic.
In a collective sample of over 3,000 subjects, we find that subjects are sensitive to changes in prizes and that the problem’s framing matters, while the incentive scheme and the payment mechanism empirically do not. Participants exhibit greater sensitivity to monetary rewards, as reflected in better-calibrated valuations for higher rewards, while broader value ranges (greater opportunity for deviations) lead to larger deviations from the objective monetary value of a dollar. Strikingly, the lack of variation across incentive schemes extends to cases where the rewards are hypothetical, adding to the literature that finds minimal or no differences between real and hypothetical stakes (e.g., Brañas Garza et al., 2023; Enke et al., 2023; Gneezy et al., 2015; Hackethal et al., 2023; Li et al., 2017; Irwin et al., 1992).Footnote 1 One reason why the incentive scheme may not yield any meaningful effects is that the evaluated tasks (preference elicitations) are not cognitively demanding, or that higher cognitive effort may not lead to a meaningful difference between hypothetical and real behavior. We also find that strategic uncertainty produces better-calibrated valuations; however, different incentive schemes again play no meaningful role in overall bidding behavior. Although misunderstanding may be a factor explaining misbehavior (Serizawa et al., 2024), we doubt that it is driving our results, given the strict measures taken to ensure that subjects were attentive to the instructions and understood the procedures.
The rest of the paper proceeds as follows. The next section reviews the relevant literature to set the context and motivate our study. We then present the two experiments sequentially by describing the methods and summarizing the results from each experiment. We conclude in the final section.
2. Related literature
This section briefly reviews some of the existing literature on the effect of payment mechanisms on experimental auctions and Between-Subject Random Incentive Schemes (BRIS). We highlight key studies and findings in the field, revealing the complexities and debates surrounding effective incentive design, the role of cash balances, and the effectiveness of different incentivization strategies. This overview offers insights into how experimental setups and incentive mechanisms can significantly affect economic behavior and decision-making processes.
Early studies in the auction literature investigating the winner’s curse sparked debates due to the use of mechanisms that paid for all decisions across multiple periods (Kagel & Levin, 1986). Hansen and Lott (1991) argued that the bids above the auctioned item’s conditional expected value at the theoretical bidding equilibrium observed in Kagel and Levin (1986) could be a rational response to limited liability and low cash balances (i.e., accumulated earnings from being paid for multiple rounds).Footnote 2
In response, Kagel and Levin (1991) conducted a follow-up experiment that ensured subjects had sufficient cash balances, so that deviations from the predicted (risk-neutral) Nash equilibrium could not be explained by limited liability arguments, and still obtained significant overbidding. Ham et al. (2005) argued that cash balances may also affect bidding behavior in private value auctions; to address this concern, they introduced exogenous variation in cash balances by randomly assigning additional payments while subjects bid in a first price auction. They found that cash balances play a statistically significant role in bidding behavior in private value auctions as well.
While cash balance effects can be avoided by paying for only one randomly selected trial, Ham et al. (2005) further noted that such schemes affect subjects’ incentives, potentially diluting payoffs in two ways. First, the expected payoff of a trial is the compounded probability of that trial being selected multiplied by its payoff, which dilutes incentives as the number of trials increases and/or as per-trial payoffs shrink. Second, since only one bidder earns money in many auction formats, as in a first or a second price auction, effective recruitment of subjects can only be achieved with large fixed show-up fees, which may render the incentives associated with the auctions trivial.Footnote 3
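To make the first dilution channel concrete, consider a back-of-the-envelope sketch (our illustration, not from the paper; the function name and interface are hypothetical):

```python
def expected_stake(payoff_per_trial, n_trials, pay_prob=1.0):
    """Expected payoff riding on any single trial when one of n_trials
    is randomly selected for payment, and payment itself occurs only
    with probability pay_prob (an illustrative sketch)."""
    return pay_prob * (1.0 / n_trials) * payoff_per_trial

# With 6 trials and a $3 prize, each decision carries $0.50 in
# expectation under a pay-one-randomly rule; at a 1% payment
# probability the stake per decision falls to half a cent.
```

The product of the selection probability and the per-trial payoff is what shrinks as tasks multiply or prizes fall, which is the dilution concern raised above.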
Also relevant to our study is a strand of the literature focusing on Between-Subject Random Incentive Schemes (BRIS) or lottery incentives, where only a subset of subjects is randomly selected to realize their decisions and receive a payment. BRIS has been investigated in several domains, including fairness (Bolle, 1990), preferences for risk and ambiguity (Anderson et al., 2023; Aydogan et al., 2024; Baltussen et al., 2012; Berlin et al., 2026; March et al., 2016), time preferences (Berlin et al., 2026), and donations in dictator games (Clot et al., 2018). More recently, Ahles et al. (2024) found that 10% and 1% payment probabilities are effective in eliciting valuations that are statistically indistinguishable from a fully incentivized scheme, and that all incentivized conditions mitigate hypothetical bias, resulting in lower elicited valuations than a purely hypothetical condition. Ahles et al. (2024) is likely the first systematic work on BRIS in valuation research, serving as the foundation for subsequent studies that have built upon and applied their methods in valuation settings (e.g., Bó et al., 2024; Hosni et al., 2024; Mustapa et al., 2025; Veettil et al., 2025).
3. Experiment 1: preference elicitation with the BDM mechanism
The following section first outlines the experimental design and implementation details, including subject recruitment, instructions, task structure, and treatment arms. We then present the main empirical results.
3.1. Methods and experimental design
This study and the subsequent study described in the next section were preregistered with the AEA’s RCT registry (AEARCTR-0009687). Subjects were panelists from Forthright Access, an online research company that handles its own recruitment through various direct advertising channels. Participants were offered a $2.50 reward for a 20-minute study. Subjects were informed they could also earn additional rewards after entering the study. We employed several quality controls to ensure subjects’ attention and comprehension based on a pilot study with 78 subjects (Haaland et al., 2023).Footnote 4
Experiment 1 involved eliciting preferences over an induced value (IV) using the BDM mechanism (Becker et al., 1964). As in Cason and Plott (2014), subjects were endowed with a card worth a known IV and were asked to state their offer price to sell the card back to the experimenter.Footnote 5 Subjects were informed that their offer price would be compared to a fixed offer randomly drawn from the interval $[0,X]$, where $X$ varied from task to task.Footnote 6 We varied the IV at a low and a high level ($1 and $3) and varied the maximum bid range, $X$, at $4, $5, and $6. Consequently, each subject participated in six tasks: all possible combinations of the IV and the upper level of the support of the distribution, $X$. The order of the six preference-elicitation tasks was randomized across subjects.
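The selling version of the BDM rule just described can be sketched as follows (a minimal illustration under the stated rules; the function name and interface are ours, not the authors'):

```python
import random

def bdm_payoff(ask, induced_value, upper_support, rng=random):
    """Resolve one BDM selling task: a fixed offer is drawn uniformly
    from [0, upper_support]; if it meets or exceeds the subject's ask,
    the card is sold at the drawn offer; otherwise the subject keeps
    the card, which is worth its known induced value."""
    fixed_offer = rng.uniform(0.0, upper_support)
    if fixed_offer >= ask:
        return fixed_offer   # card sold at the random fixed offer
    return induced_value     # card kept; worth exactly the IV
```

Under this rule, asking exactly the induced value is weakly dominant: the subject then sells only when the random offer is at least as large as the card's value, so every resolved payoff is at least the IV.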
The instructions included several examples detailing the BDM mechanism, followed by a series of true/false and open-ended comprehension questions. Although we provided detailed instructions and examples to ensure that participants understood the mechanisms in the study, we did not instruct participants on what to bid or explicitly guide their behavior.Footnote 7 After screening out inattentive subjects, the sample included 2,575 subjects.Footnote 8 In addition to the participation fee, subjects analyzed in this paper earned an average of $2.67 (min=$0, max=$29.4, SD=$5.82).
Our experimental design also varied two between-subjects dimensions: incentive schemes and payment mechanisms.Footnote 9 To test for potentially diluted incentives, the incentive scheme varied the likelihood that decisions would be paid: subjects had either a 100%, a 50%, or a 1% chance of receiving the monetary rewards associated with their decisions. After collecting data for these treatments and finding no differences between them, we decided to run two additional boundary conditions: a 0.2% chance of receiving monetary rewards and a purely hypothetical treatment. Thus, the incentive scheme comprised five distinct probabilities of payment. Every subject was informed about the probability of their decisions being paid on two different screens: one at the beginning of the study and one right before their preferences were elicited with the BDM mechanism. In the hypothetical treatment, subjects were informed multiple times at different points of the study that although monetary rewards would be shown on various screens, they would only receive a fixed compensation, and none of the stated monetary amounts would count toward their earnings. The instruction text was modified appropriately for the corresponding treatments, varying the probability of payments. The instruction scripts appear in the Online Appendix.
The second dimension, the payment mechanism, varied the number of paid decisions, the correlation between those payments, and whether magnitudes were adjusted according to the number of paid decisions. Our baseline is the Pay-One-Randomly (POR) mechanism, where only one of the six tasks is randomly selected for payment. We compare the POR mechanism with four additional payment mechanisms previously used by Cox et al. (2015) (in an application to decisions under risk): (a) the Pay-All-Correlated (PAC) mechanism, (b) the PAC mechanism adjusted for the number of tasks (PACn), (c) the Pay-All-Independently (PAI) mechanism, and (d) the PAI mechanism adjusted for the number of tasks (PAIn).Footnote 10 In the PAC mechanism, subjects were paid for all six preference elicitation tasks, with the fixed offers determined by a single draw: a random percentage was drawn between 0% and 100%, and this percentage was multiplied by the upper support of the distribution of allowed offers to determine the fixed offer for each task. An arithmetic example illustrated this mechanism for subjects. The PACn mechanism was explained in a similar fashion, except that subjects were told they would receive one-sixth of the total payoffs (i.e., the payoffs were divided by the number of tasks).
In the PAI mechanism, subjects received an independent draw per task: subjects were informed that the computer would choose a random percentage for each task, which would be multiplied by the upper support of the distribution of allowed offers to determine a different fixed offer per task. An arithmetic example illustrated this mechanism for subjects. The PAIn mechanism was explained similarly to PAI, except that subjects were told they would receive one-sixth of the total payoffs.
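The difference between the correlated and independent draws, and the role of the "n" adjustment, can be sketched as follows (our illustration of the rules described above; function names are hypothetical):

```python
import random

def pac_offers(upper_supports, rng=random):
    # Pay-All-Correlated: a single percentage draw sets every task's
    # fixed offer as that percentage of the task's upper support
    pct = rng.random()
    return [pct * x for x in upper_supports]

def pai_offers(upper_supports, rng=random):
    # Pay-All-Independently: a fresh percentage draw for each task
    return [rng.random() * x for x in upper_supports]

def total_earnings(asks, ivs, offers, divide_by_n=False):
    # Sum the BDM payoff over all tasks (sell at the fixed offer if it
    # meets the ask, otherwise keep the card worth the IV); the "n"
    # variants (PACn, PAIn) divide the total by the number of tasks
    payoffs = [offer if offer >= ask else iv
               for ask, iv, offer in zip(asks, ivs, offers)]
    return sum(payoffs) / len(payoffs) if divide_by_n else sum(payoffs)
```

Note that under PAC all tasks share the same percentage draw, so the fixed offers are perfectly correlated across tasks, while under PAI they are independent; the expected offer per task is the same in both cases.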
Table 1 summarizes the experimental design and the number of subjects assigned to each treatment arm. Our target of 100 subjects per treatment is large enough to detect minimum differences in relative absolute bid deviations ($|bid-IV|/IV$) of 0.05 or larger with 80% power. Sample size calculations, instructions, examples, and final payoff screens are provided in the Online Appendix, which can also be consulted at the Open Science Framework.
Table 1. Experimental design and number of subjects per treatment

Notes: PAC, PAI, and POR stand for pay-all-correlated, pay-all-independently, and pay-one-randomly, respectively. n indicates the sum of payoffs is divided by the number of tasks.
3.2. Experiment 1 results
Figure 1 shows CDFs of bid deviations from the IV (Panel a) and relative absolute deviations from the IV (Panel b).Footnote 11 It is clear that greater misbidding occurs for the lower IV, as the respective CDF is shifted further to the right.Footnote 12 Figure 1(a) also shows that overbidding is more prevalent than underbidding. Only 15.50% of all bids are exactly equal to the IV, and 24.71% (30.89%) of all bids are within 5% (10%) of the IV. Cason and Plott (2014) report that, without training, 16.7% of subjects bid within 5 cents (2.5%) of their induced value of $\$2$, which is similar to our findings. Moreover, Brown et al. (2025) find similar patterns of misbidding that are fairly constant across various elicitation formats that are strategically equivalent to, but cognitively simpler than, the BDM mechanism.

Fig. 1 CDFs of bid deviations from IV (BDM). (a) Bid deviations from IV. (b) Relative absolute bid deviations from IV
Table 2 shows descriptive statistics (mean, standard deviation, median) for the relative absolute deviations by incentive scheme and payment mechanism. The deviations in this table are remarkably stable across treatments at around 0.5, consistent with a statistically insignificant effect of both incentive schemes and payment mechanisms.Footnote 13
Table 2. Descriptive statistics of $|Bid-IV|/IV$ by payment mechanism and incentive scheme

Notes: This table shows means, standard deviations in parenthesis and medians in brackets, pooled across the six decisions. PAC, PAI, and POR stand for pay-all-correlated, pay-all-independently, and pay-one-randomly, respectively; n indicates the sum of payoffs is divided by the number of tasks.
Table 3 shows estimates from regression models with standard errors clustered at the individual level, using either bid deviations ($Bid - IV$) or relative absolute deviations ($|Bid-IV|/IV$) as the dependent variable and the treatment indicators as independent variables.Footnote 14
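For reference, the two dependent variables are simple transformations of each bid (a trivial sketch; the function names are ours):

```python
def bid_deviation(bid, iv):
    # signed deviation from the induced value:
    # positive values indicate overbidding, negative underbidding
    return bid - iv

def relative_abs_deviation(bid, iv):
    # relative absolute deviation |Bid - IV| / IV, which scales
    # misbidding by the size of the induced value
    return abs(bid - iv) / iv
```

Scaling by the IV is what makes deviations comparable across the $1 and $3 induced-value tasks.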
Table 3. Regressions of bid deviations on treatment variables

Notes: Clustered standard errors in parentheses.
* p$\lt$0.1, ** p$\lt$0.05, *** p$\lt$0.01. Base categories for the treatment variables are: IV = 1 & Support = 4, 100% & POR. PAC, PAI, and POR stand for pay-all-correlated, pay-all-independently, and pay-one-randomly, respectively. n indicates the sum of payoffs is divided by the number of tasks.
As shown in Table 3, relative to the 100% incentive scheme under the POR baseline payment mechanism, neither the incentive schemes nor the payment mechanisms significantly affect misbidding behavior: none of the corresponding coefficients is statistically different from the baseline. On the other hand, both the IV and the support level of the distribution affect deviations from the induced value. More specifically, the upper panel of Table 3 shows that a larger induced value reduces deviations from the IV and that this reduction is moderated by the level of the support of the distribution. For the lower IV of $1, a larger support increases relative absolute misbidding by 0.17 to 0.39. The larger IV of $3 reduces misbidding, but with a larger support this reduction shrinks. For example, model (1) shows that misbidding declines by 0.76 for a $4 support but only by 0.31 for the larger support of $6.Footnote 15
Additional analysis in Section B in the Online Appendix estimates ordered logit models by transforming the dependent variable to categories (under/over bids and bids equal to the IV). Results are similar to what was discussed above.
4. Experiment 2: preference elicitation with the Second Price Auction
To test whether the preference elicitation mechanism affects elicited preferences, Experiment 2 replaced the BDM mechanism with the Second Price Auction (SPA). Both the BDM and the SPA are theoretically incentive compatible, but the SPA features strategic uncertainty, as opposed to personal or objective uncertainty, since the uncertainty arises from other bidders’ strategies rather than a randomly drawn price. Although this should not influence (equilibrium) behavior, replacing the BDM with an SPA allows us to empirically test whether the source of the uncertainty matters and, hence, whether the randomization devices employed may be driving our results. To fit budget constraints, we reduced the treatment arms of Experiment 1 to a subset of treatments that are most widely used or provide boundary conditions, since these may be the most likely to affect bidding behavior. With respect to the incentive schemes, we administered a purely hypothetical treatment and a treatment that pays with 100% certainty. With respect to the payment mechanisms, we selected the POR and the Pay-All divided by the number of rounds (PAn) in order to keep incentives comparable.Footnote 16 In summary, we implement a 2$\times$2 between-subjects design in Experiment 2.
4.1. Methods and experimental design
Subjects were panelists from Forthright Access, none of whom had participated in Experiment 1. We offered a $2 reward for a 15-minute study. Subjects that were not assigned to a hypothetical treatment were informed they could also earn additional rewards after entering the study.
We implemented the same quality controls as in Experiment 1. One particular feature of this experiment is that recruitment took place within a limited time window on a single day to ensure that a large pool of participants entered simultaneously and to achieve good matching of participants to auction groups. Four subjects would form an auction group, but if more than 3.5 minutes elapsed without filling a group, we used bots to complete the group.Footnote 17 The main results present responses from subjects who were matched in groups of humans only. Results that include subjects matched with bots are presented in the Online Appendix and are similar to those for the humans-only sample. Subjects were informed about the number of bots they were matched with, if any. Furthermore, when we control for the number of bots in the regressions shown in the Online Appendix, we find that the inclusion of bidding bots does not significantly affect the results. All subjects in a group were assigned to the same treatment for the entire experiment.Footnote 18
The final sample with complete responses includes 637 subjects, although 209 of them were matched with one or more bots. On top of their participation fee, subjects received an average of $1.06 (min=$0, max=$3, SD=$1.12). Table 4 shows the number of subjects per treatment. In the main regressions, we only use observations from subjects who were not matched to bots; we also controlled for the number of bots in additional specifications shown in the Online Appendix, and all of our results hold.
Table 4. Experimental design, number of subjects, and number of bots per treatment

Notes: PA and POR stand for pay-all and pay-one-randomly, respectively; n indicates the sum of payoffs is divided by the number of tasks.
Similar to Experiment 1, subjects were endowed with a card worth a known IV and were asked to state their offer price to sell the card back to the experimenter, with the understanding that they were assigned to a group of four subjects, that their offer would be compared to all other offers, that the lowest offer would be accepted, and that the second-lowest offer would be the binding price. Subjects experienced four different IVs selected to be in the same range as in Experiment 1: $1, $1.7, $2.4, and $3. Subjects experienced all the IVs in random order, and in any given round each of the four IVs was assigned to exactly one subject.
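The auction rule just described can be sketched as follows (our illustration; tie-breaking here is by list order, which the experiment's instructions, as summarized above, do not specify):

```python
def spa_payoffs(offers, induced_values):
    """Resolve one (reverse) second price auction round: the lowest
    offer sells the card and is paid the second-lowest offer; all
    other bidders keep cards worth their induced values."""
    order = sorted(range(len(offers)), key=lambda i: offers[i])
    winner, runner_up = order[0], order[1]
    payoffs = list(induced_values)        # non-winners keep their cards
    payoffs[winner] = offers[runner_up]   # winner paid 2nd-lowest offer
    return payoffs
```

As in the BDM, offering exactly the induced value is weakly dominant here: one's own offer determines whether the sale occurs, but the price received is set by the runner-up's offer.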
Before participating in the SPA, all subjects went through similar instructions, comprehension questions, and quality checks as in Experiment 1. All experimental instructions, test questions, and attention check questions are provided in the Online Appendix, which is also available at the Open Science Framework.
4.2. Experiment 2 results
Figure 2 shows CDFs of bid deviations from IV (Panel a) and relative absolute deviations from IV (Panel b) for two of the IVs.Footnote 19 We purposefully keep the scale of the x-axis similar to Figure 1 to facilitate the visualization of the differences to the BDM mechanism in Experiment 1. The results show evidence that the SPA leads to less misbidding than the BDM and that a larger IV reduces misbidding. In the SPA, 19.98% of all bids are exactly equal to the IV, and 27.45% (42.93%) of all bids are within 5% (10%) of the IV. This is a substantial improvement compared to the BDM mechanism in Experiment 1.

Fig. 2 CDFs of bid deviations from IV (SPA). (a) Bid deviations from IV. (b) Relative absolute bid deviations from IV
Table 5 shows estimates from regression models with standard errors clustered at the individual level, where we regressed either bid deviations ($Bid - IV$) or relative absolute deviations ($|Bid-IV|/IV$) on the treatment dummies. The sample is restricted to subjects who were not matched with a bot in the SPA.Footnote 20
Table 5. Regressions of bid deviations on treatment variables for the SPA

Notes: Clustered standard errors in parentheses.
* p$\lt$0.1, ** p$\lt$0.05, *** p$\lt$0.01. Base categories for the treatment variables are: IV = 1, Hypothetical, and POR. PA and POR stand for pay-all and pay-one-randomly, respectively; n indicates the sum of payoffs is divided by the number of tasks.
Results are similar to the general pattern we observe with the BDM mechanism. Higher IVs reduce the level of misbidding; however, misbidding is unresponsive to the payment mechanism and the incentive scheme, i.e., whether the treatment is hypothetical or real.
4.3. The BDM mechanism vs. the SPA
The average $Bid-IV$ is 0.29 in the BDM and -0.16 in the SPA, indicating that subjects on average overbid in the BDM mechanism and underbid in the SPA. In terms of relative absolute deviations ($|Bid-IV|/IV$), subjects deviate on average by 51.9% in the BDM mechanism and around 17.4% in the SPA, a substantially lower level of misbidding in the SPA. The magnitude of the improvement with the SPA is large.
To quantify and statistically test these differences, we also regressed bid deviations on an SPA dummy and demographic controls (standard errors clustered at the individual level) and confirm that the SPA elicits smaller deviations from IVs ($\hat{b}=-0.446$, $se=0.023$). Similarly, for relative absolute deviations, the SPA elicits deviations 33.9 percentage points smaller than the BDM ($se=0.012$).
Section C in the Online Appendix shows additional analysis where we explore whether subjects’ behavior is consistent with game form misconception. While we find no differences in the payment mechanisms and incentive schemes, design features such as the IV and the support of the distribution have an impact on the bidding behavior. The results also clearly indicate that the SPA induces behavior that is closer to the IV and reduces the likelihood of misbidding compared to the BDM mechanism.
5. Discussion and conclusions
While most previous work on incentive-compatible payment schemes is theoretical, this paper explored the effects of incentive schemes and payment mechanisms in economic experiments using an empirical approach focused on value elicitation across two large-sample experimental studies. Given the abundant literature showcasing empirical deviations from theoretical expectations, we argue that an empirical approach is needed in the debate over incentives and payment mechanisms, one that gives weight to the outcomes actually produced by participants in experiments. We found that while the nature of the incentive – hypothetical or real – had minimal impact on participants’ bidding behavior, design elements, such as the magnitude of induced values and the range of offers, significantly influenced outcomes. Specifically, larger induced values and smaller offer ranges led to more accurate bidding, aligning more closely with theoretical expectations. Our results therefore suggest that design elements of the experimental environment may influence decision-making more than the incentive scheme.
Comparing the BDM mechanism (personal uncertainty) with the SPA (strategic uncertainty), the latter showed an improvement in aligning bids with the induced values, indicating that the SPA produces less misbidding than the BDM. When comparing different payment mechanisms, decision-making noise and misconceptions about payoff functions were minimal across both auction mechanisms. A potential source of the differences between the BDM and the SPA (and one that we cannot identify with the current set of experiments) is failures of contingent reasoning, that is, the cognitive difficulties subjects face when considering all possible outcomes (Martínez-Marquina et al., 2019). Assuming behavior might be influenced by the distribution of others’ bids (Georganas et al., 2017, find that subjects respond systematically to out-of-equilibrium incentives in SPAs), a subject may go through more strenuous mental gymnastics in the BDM mechanism to consider all possible contingencies; hence the lower rate of mistakes in the SPA. Others have argued that the shape of the payoff function renders individual deviations from optimal behavior more costly in percentage terms in the SPA than in the BDM mechanism, and that the loss from deviations increases with the number of bidders (Noussair et al., 2004).
Our findings suggest that the effectiveness of incentive mechanisms in eliciting underlying preferences in economic experiments is complex. While certain design elements, like the magnitude of rewards and the range of offers, play a critical role, the choice of elicitation mechanism (BDM vs. SPA) also significantly impacts the accuracy of outcomes. These uneven behavioral responses highlight the need for careful consideration of behavioral factors in experimental design to ensure the reliability and validity of results in economic research. We note that we included strict attention checks, which may distinguish our design from other studies.
We conclude by asserting the growing need to understand the complex interplay between cognitive effort and improved choices. The mounting number of perplexing null results on hypothetical bias cannot be explained without a tighter grasp of this relationship. We call these results perplexing because even theoretically improper incentives yield identical responses. At the same time, more opportunities for mistakes and more complex decision problems can exacerbate differences both between and within methods. It is imperative that we learn more about the empirical nature of payment mechanisms to counterbalance the predominantly theoretical nature of the existing literature.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/eec.2026.10042.
Acknowledgements
Part of this work was completed while Andreas Drichoutis was a Fulbright Visiting Scholar at Texas A&M University. We would like to thank Tim Cason and Uri Gneezy for helpful comments. We received financial support for this project from the Institute for Advancing Health through Agriculture at Texas A&M University. The replication material for the study has been deposited with the Open Science Framework at https://doi.org/10.17605/OSF.IO/2QPNW.
