Heterogeneity of Beliefs and Trading Behavior: A Reexamination

Abstract Combining experimental data sets from seven individual studies, including 255 asset markets with 2,031 participants, and 36,326 short-term price forecasts, we analyze the role of heterogeneity of beliefs in the organization of trading behavior by reproducing and reconsidering earlier experimental findings. Our results confirm prior evidence that price expectations affect trading behavior. However, heterogeneity in beliefs does not seem to drive overpricing and asset market bubbles, as suggested by earlier studies, and we find no indication of short-term beliefs being better determinants of trading behavior than longer-term beliefs.


I. Introduction
The heterogeneity of beliefs is a key element of trading in asset markets. This has its roots in theoretical arguments (Harrison and Kreps (1978), Scheinkman and Xiong (2003), and Hong and Stein (2007)) but is also suggested by empirical findings (Verardo (2009), Giglio, Maggiori, Stroebel, and Utkus (2021), and Meeuwis, Parker, Schoar, and Simester (2022)). Such heterogeneity in beliefs results from differences in how traders evaluate information (Detemple and Murthy (1994)), in how they make predictions via mechanisms, learning, or effort (e.g., Goeree and Hommes (2000)), in acquiring information (Basak (2005)), or from having differences in opinion, preferences or traits (Jouini and Napp (2007)). Understanding the relationship between the heterogeneity of beliefs and trading is thus paramount to understanding asset market behavior.

We thank an anonymous reviewer and Jarrad Harford (the editor) as well as conference participants at 2022 Experimental Finance and the 2022 Behavioral Macroeconomics Workshop for helpful comments and suggestions. We also thank Tim Carlé and Tibor Neugebauer for clarifying details about their original analyses as well as Eric Guerci and Michelle Song for making the full data sets from Holt, Porzio, and Song (2017) and Duchêne, Guerci, Hanaki, and Noussair (2019) available. CRediT author statement: Conceptualization and Methodology: S.F., C.H., U.W.; Data curation, formal analysis, investigation, visualization, validation, and writing original draft: S.F., C.H.; Writing (review and editing): all authors. All errors are our own.
Testable implications of the heterogeneity of beliefs include that optimists are willing to buy and hold shares (e.g., Hirshleifer (1975), Harrison and Kreps (1978)) or that larger heterogeneity will increase overall trading (Varian (1985)). To credibly test these hypotheses, we need information about traders' underlying beliefs. This information, however, is not available in archival trading data (Bloomfield and Anderson (2010)) and can only be proxied imperfectly and indirectly: Anderson, Ghysels, and Juergens (2005), for example, use publicly available analyst forecasts, and Buraschi and Jiltsov (2006) use survey data related to investor sentiment. Experimental methodology, by contrast, allows us to directly study the relationship between beliefs and trading behavior by eliciting traders' beliefs (with appropriate incentives) while they trade in an experimental asset market (e.g., Haruvy, Lahav, and Noussair (2007); for a discussion on different incentive schemes in eliciting price forecasts, see Hanaki, Akiyama, and Ishikawa (2018)). At the same time, experiments allow for a high degree of control of the market environment and access to information. The experimental settings can be replicated under identical conditions, but also under different scenarios, yielding a sufficient number of independent observations to make statistical inferences.
Given that many experimental designs foster homogeneous beliefs by keeping information conditions identical for all traders, the no-trade theorem typically applies (Milgrom and Stokey (1982)).1 Still, trading occurs in such asset market experiments (e.g., Kleinlercher and Stöckl (2021)). One explanation can be that beliefs are heterogeneous even though all relevant information is common knowledge. We show that, indeed, heterogeneous beliefs result in heterogeneous actions such that trading is possible, supporting theoretical arguments (most notably Miller (1977), Harrison and Kreps (1978)).

Carlé, Lahav, Neugebauer, and Noussair (2019) study the effect of beliefs on trading behavior by analyzing asset market data from Haruvy et al. (2007). Their data include elicited price expectations (serving as a proxy for beliefs) as well as prices, bids and asks, and share inventories from six experimental call markets with up to nine traders each. In each period, before trading takes place, each subject predicts the clearing price for the upcoming period (short-term beliefs) and for all remaining periods (long-term beliefs). They conclude that price expectations guide trading behavior by showing that the periods' net purchases, shareholdings, and submitted orders, as well as subjects' earnings, are dependent on their respective short- and long-term beliefs. Nevertheless, their analysis builds upon data from only one study employing one particular experimental design (i.e., the setup by Smith, Suchanek, and Williams (1988)). This implies at least three critical issues that call for a more extensive analysis at a greater scale. First, they only consider data from one experiment, and second, the experimental data under consideration only includes six independent markets, resulting in limited statistical power (also see Ioannidis (2005)). The third issue comes with the chosen experimental setup, which yields results that are sensitive to seemingly small design variations (Kirchler, Huber, and Stöckl (2012)). In empirical finance, such a study would compare to finding evidence for a theoretical relationship considering data from one particular asset in one particular market, while neglecting other assets, other markets, and other market designs. Therefore, a more extensive empirical analysis of their results at a greater scale (considering more studies using different market designs yielding a much greater overall sample size) is necessary to test and, possibly, verify their findings.
An essential feature of the scientific discovery process is that replications are able to convince us "that we are not dealing with a mere isolated 'coincidence', but with events which, on account of their regularity and reproducibility, are, in principle intersubjectively testable" (Popper (1959), p. 23). The ability to replicate experimental results is one of the main methodological advantages of experimental finance (Bloomfield and Anderson (2010)). Recent studies have investigated whether experimental economics results do indeed replicate. For example, Camerer, Dreber, Forsell, Ho, Huber, Johannesson, Kirchler, Almenberg, Altmejd, and Chan (2016) report that about two-thirds of their attempted replications of 18 studies published in the American Economic Review and the Quarterly Journal of Economics were successful. However, one-third were not, although the authors implemented identical study designs.2 This underlines the importance of critically examining prior work.
To reexamine the relationship between heterogeneity in beliefs and trading behavior and replicate Carlé et al.'s (2019) initial results, we administer an out-of-sample validation test, in which the results derived from one experimental setting are tested "out of sample" in an environment with a similar structure. We thus apply Carlé et al.'s analysis on different data sets derived from different experimental market designs in seven primary studies. We view this exercise (combining data from seven individual studies including 255 laboratory markets with a total of 2,031 participants and 36,326 short-term price forecasts, and applying an analysis identical to the original study to this newly compiled, rich data set) as a replication of the original results and hypotheses.3 Accordingly, we confront prior claims with new evidence.
Our data set consists of experimental asset market data, including trading data and beliefs, generated by Eckel and Füllbrunn (2015), Eckel and Füllbrunn (2017), Holt et al. (2017), Duchêne et al. (2019), Huber, Bindra, and Kleinlercher (2019), and Weitzel, Huber, Huber, Kirchler, Lindner, and Rose (2020), to replicate Carlé et al.'s (2019) results. We also reanalyze the original data from Haruvy et al. (2007) to reproduce Carlé et al.'s observations. We select these studies for their recency, their rich designs with various treatments and subject compositions, their data availability, and because they all contain the relevant information regarding belief and trading data. The experiments portrayed in the selected studies consider different experimental design elements, such as continuous double auction markets (instead of call markets),4 a constant fundamental value (instead of a decreasing one),5 or the use of financial professionals (instead of student subjects), in addition to other design characteristics. By varying a number of different experimental design elements, the seven studies under consideration naturally examine different research questions. They all employ similarly structured asset market experiments aiming to better understand the development of asset prices as well as trading behavior and expectations. At the same time, these studies provide a broad set of variations in important market characteristics that might potentially affect not just market prices, but also market participants' beliefs and trading behavior. Together, these features allow us to control for different design elements beyond Carlé et al., aiming to generalize their findings on heterogeneous beliefs being a determining factor of trading behavior.
Our data analysis is analogous to Carlé et al. (2019). We reviewed their empirical observations and applied the respective analyses separately to each of the studies under consideration, as well as jointly using regressions with pooled data from all studies, controlling for the respective treatment variations.
First, our study contributes to the literature on the effect of heterogeneous beliefs on market performance. Many studies have suggested that heterogeneity of expectations drives trading behavior (e.g., Hong and Stein (2007)) and we do find such a clear relationship, confirming that beliefs determine subjects' trading behaviors. In a nutshell, we find that optimists tend to become buyers while pessimists tend to become sellers, which is also reflected in their respective bids and asks.
Moreover, we confirm that individuals form adaptive beliefs about market prices, in line with, for example, Smith et al. (1988) or Haruvy et al. (2007). When a subject's forecast is too high, she adjusts downward; when her forecast is too low, she adjusts upward.
We find no relationship between belief dispersion and overpricing, however, for the full set of considered studies. This result is, to some extent, surprising as it is in contrast to the theoretical arguments outlined by Miller (1977). He argues that without short selling, heterogeneous beliefs lead to higher market prices because optimistic traders have ample room for expressing their beliefs by purchasing assets, while pessimists run out of assets and cannot short sell. That argument holds "as long as the entire supply of the [asset] can be absorbed by a minority of the potential purchasers" (Miller (1977), p. 1153), that is, as long as the most optimistic traders, in particular, have the available cash to bid up the asset price to a level consistent with their beliefs. In all seven studies we consider, traders do have sufficient cash endowments to drive up market prices.6 Our result is also in contrast to empirical analyses using mere proxies for the heterogeneity in beliefs (e.g., Doukas, Kim, and Pantzalis (2006)), but does confirm previous experimental studies that find no effect of traders' divergence of opinion on price levels (Fellner and Theissen (2014)).
We also contribute to the experimental finance literature on asset market research related to market performance and trading behavior; our results might add to the understanding of heterogeneous results within treatments, such as differences in price bubble measures within similar market conditions (Palan (2013), Powell and Shestakova (2016)).
Finally, by replicating the findings of Carlé et al. (2019), we contribute to recent discussions on reproducibility and replicability in the social sciences in general, and in experimental economics and finance in particular (e.g., Camerer et al. (2016), Camerer, Dreber, and Johannesson (2019)). By modern standards in experimental research, the data set examined by Carlé et al. (2019) is comparatively small (six independent observations with nine traders each). A limited number of observations results in low statistical power and can thus reduce the likelihood of detecting a true effect. Importantly, however, it can also increase the likelihood that a given finding, albeit statistically significant, does not reflect a true effect (Ioannidis (2005), Button, Ioannidis, Mokrysz, Nosek, Flint, Robinson, and Munafò (2013)); it could be an artifact from the experiments' design, subjects, or context, or simply from chance. This, in turn, can lead to overestimated effect sizes and low reproducibility of results (e.g., Open Science Collaboration (2015), Camerer et al. (2016), Ioannidis, Stanley, and Doucouliagos (2017), and Camerer et al. (2019)). Our replication results show that this does not seem to be the case. We significantly enlarge the number of studies and the number of individual markets (n = 255) to consider the research questions posed by Carlé et al. and are indeed able to replicate their results (in different experimental settings and with different subject pools, but applying analogous statistical analyses). Compared to Carlé et al. (2019), we also decreased the chance that null results are false negatives. The present study thereby represents one of the first replication attempts relating to experimental asset market results and is also among the first studies applying meta-scientific principles in this line of research.7 Our results strengthen our confidence in experimental research in (financial) economics and provide additional evidence on its generalizability to a multitude of different settings and participant groups.

6 To check whether cash constraints were binding, we considered how many assets a trader was able to buy for the current market price, on average. For comparability across studies, we also divided that number by the total assets available. We find that, in all studies, traders could initially purchase at least 10% of all shares, while later, they could purchase even more than the number of shares available, because i) the cash-to-asset ratio often increased (with dividend and interest payments or due to a declining fundamental value) during trading, and ii) prices had the tendency to collapse toward the end of trading. Hence, traders were not cash-constrained.

Füllbrunn, Huber, Eckel, and Weitzel 1341
The remainder of this article proceeds as follows: In Section II, we explain the main elements of the experimental designs, introduce the variables of interest to run the analyses, and state the observations from Carlé et al.'s (2019) study that we aim to validate. We present our replication results of the regression analyses in Section III. In Section IV, we discuss our results and provide concluding remarks. For details on the specific experimental designs of all seven studies under consideration, we refer the reader to the Appendix.

II. Measurement and Methodology
In this study, we test whether the results of Carlé et al. (2019, henceforth CLNN) for the Haruvy et al. (2007, henceforth HLN) data also hold for different data sets. Therefore, we apply the same empirical strategy of CLNN to the data sets by EF, HPS, WH, and DGHN. Table 1 provides the observations stated in CLNN. In the following, we consider all observations from CLNN except observations 6 and 9. Observation 6 relates to market repetitions that are not available in the new data sets. Observation 9 results from simulations applied to a call market environment, which we cannot directly apply to the double auction environment. Hence, we test whether observations 1-5, 7, and 8 hold. For each observation, we compare our results to the first-repetition results of HLN.
For the analysis, we use similar measures as those employed by CLNN, listed in Table 2. The short-term belief (1, STB) is a subject's price forecast for the upcoming period. The long-term belief (4, LTB) is the average of a subject's price forecasts for the remaining periods. HLN, EF, and DGHN provide respective observations for all remaining periods; HPS and WH only provide price forecasts for the upcoming three periods. Hence, we created an additional measure, mid-term belief (2, MTB), which is then available for all studies under consideration. A further measure we introduce when considering CLNN's observation 4 is what we call an overlapping belief (3, OB). A given period $t$'s OB forecasts the same periods as the previous period $t-1$'s MTB but at different points in time. We are thus able to directly compare them. The measures relative belief-price deviation (5, RBPD) and relative belief-value deviation (6, RBVD) equal the average percentage difference between the short-term belief and the average period price or the fundamental value of the asset, respectively. Note that an asset's fundamental value is the present value of all discounted dividends and the discounted redemption value. Short-term belief dispersion (7, STBD) measures the period's standardized variability of short-term beliefs in a market. Finally, relative deviation (8, RD) measures overpricing for the entire market, meaning the average percentage difference between the average period price and the fundamental value (Stöckl, Huber, and Kirchler (2010)).

Table 1 lists the observations stated in Carlé et al. (2019). A checkmark (✓) indicates that we find a similar relationship in our study with a more extensive data set; a cross mark (✗) indicates that we are not able to replicate this observation. Note that we do not consider observation 6, as we only have one repetition of the market in three out of four additional studies under consideration (EF, HPS, and WH), while Carlé et al. (2019) considered all four repetitions in HLN. We also do not consider observation 9, which employs simulations based on the call market data from HLN.

Obs.  Claim

Individual Beliefs and Behavior
1  Subjects who believe that prices will be higher tend to be buyers and subjects who believe that prices will be lower tend to be sellers (✓). Consequently, shareholdings are positively correlated with beliefs (✓).
2  Subjects who believe that prices will be higher submit higher bids and asks, and subjects who believe that prices will be lower submit lower bids and asks (✓).
3  Short-term beliefs are better determinants of trading behavior than long-term beliefs (✗).
4  Individuals increase (decrease) the price estimates in their long-term belief profile when their short-term belief has turned out to be below (above) the realized market price (✓). The short-term price estimates behave in the same manner (✓).
5  Subjects who accurately forecast asset prices, and subjects who expect prices close to fundamentals, earn higher profits (✓).

Belief Dispersion and Market Behavior
6  Short-term belief dispersion declines with experience (not considered).
7  Belief dispersion has no significant effect on transaction volume (✓). Belief dispersion is associated with higher prices (✗). The initial belief dispersion can be indicative of the later market price level (✗).
8  Belief dispersion affects the relative size of price changes (✓). The relative size of past price changes affects the dispersion of beliefs (✓). The latter effect is stronger than the former (✗).

Market Behavior, Belief Data, and Forecasting Market Behavior
9  Simulated prices (and quantities) based on the short-term belief profile resemble the actual ones observed in the experiment (not considered).

Measurement
Table 2 shows the measures used by Carlé et al. (2019) and employed in our analysis. $B^{t}_{g,i,s}$ equals the price forecast of subject $i$ in market $g$ in period $t$ for period $s$; for $s = t$, the price forecast in period $t$ was for period $t$. A bar accent indicates the average. $P$ is the price and $V$ the fundamental value.

(1) Short-term belief: $\mathrm{STB}_{g,i,t} = B^{t}_{g,i,s}$ for $s = t$
(2) Mid-term belief: $\mathrm{MTB}_{g,i,t} =$ [...]
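The deviation measures described above (RBPD, RBVD, and RD) share one structure: an average percentage difference between a series and its benchmark. The following is a minimal sketch of that structure only. It assumes that the percentage difference is taken period by period relative to the benchmark and then averaged; the exact normalization is not shown in this section, and the function name `mean_relative_gap` is hypothetical.

```python
from statistics import mean


def mean_relative_gap(values, benchmarks):
    """Average of (value - benchmark) / benchmark over all periods.

    Assumption: each period's gap is normalized by that period's benchmark.
    With values = average period prices and benchmarks = fundamental values,
    this mimics RD; with values = short-term beliefs, it mimics RBPD
    (benchmark = price) or RBVD (benchmark = fundamental value).
    """
    if len(values) != len(benchmarks):
        raise ValueError("values and benchmarks must have equal length")
    if not values:
        raise ValueError("at least one period is required")
    gaps = [(x - b) / b for x, b in zip(values, benchmarks)]
    return mean(gaps)


# Illustrative (made-up) numbers: three periods with a constant fundamental value.
prices = [120.0, 110.0, 90.0]
fundamentals = [100.0, 100.0, 100.0]
rd = mean_relative_gap(prices, fundamentals)
```

A positive value indicates average overpricing relative to the benchmark, a negative value average underpricing.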
Parts of CLNN's observations refer to individual bids and asks. In the studies with call market data, a period observation for each subject included one bid, one ask, and one clearing price. In the studies with double auction market data, however, the subjects submitted multiple bids and asks throughout the trading period, trading several shares at different prices. At the same time, some periods have no trades at all or some subjects did not submit orders. Consequently, we consider each subject's average bid and ask in a period as the relevant variable. If a subject did not place an order, this subject had a missing value for that period. The price benchmark in each period is the target price (the average period price (EF and WH) or the clearing price (HLN, DGHN, and HPS)).
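The aggregation of double auction orders into one bid and one ask per subject and period can be sketched as follows. This is an illustration under assumed names and an assumed record layout (`subject`, `period`, `side`, `price`), not the code from the OSF repository.

```python
from collections import defaultdict
from statistics import mean


def average_offers(orders):
    """Collapse raw orders into one average price per (subject, period, side).

    `orders` is an iterable of (subject, period, side, price) tuples,
    where side is "bid" or "ask" (hypothetical layout).
    """
    grouped = defaultdict(list)
    for subject, period, side, price in orders:
        grouped[(subject, period, side)].append(price)
    return {key: mean(prices) for key, prices in grouped.items()}


def offer_or_missing(averages, subject, period, side):
    """Return the average offer, or None (missing) if no order was placed."""
    return averages.get((subject, period, side))


# Illustrative (made-up) orders for one period.
orders = [
    ("s1", 1, "bid", 10.0),
    ("s1", 1, "bid", 14.0),
    ("s1", 1, "ask", 20.0),
]
averages = average_offers(orders)
```

Subjects without an order in a period simply have no entry, which corresponds to the missing value described above.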
CLNN consider the ranks of the short-term and long-term beliefs. They order the beliefs in each period from the highest value (rank 9) to the lowest value (rank 1). The numbers provide an ordinal rank interpreted as a scale for optimism. A high rank means that subjects expect higher prices than subjects with low ranks. However, the number of traders was not the same in all markets. DGHN employed only six traders per market, while WH, HPS, HLN, and EF had either seven, eight, or nine traders in a market.8 For comparability, we harmonized the ranks in line with the number of participants in a market, that is, $\mathrm{RANK}_{\mathrm{HARMONIZED}} = 1 + 8\,(\mathrm{RANK} - 1)/(n - 1)$, with $n$ being the number of participants in a market. With this procedure, each market's lowest and highest ranks are 1 and 9, respectively, with equally spaced ranks in between.
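The rank harmonization can be illustrated with a short sketch. It assumes a linear rescaling, which is the only mapping that sends rank 1 to 1 and rank n to 9 with equally spaced ranks in between; the function name `harmonize_rank` and the `top` parameter are hypothetical.

```python
def harmonize_rank(rank, n_traders, top=9):
    """Map a within-market rank in 1..n_traders onto the common scale 1..top.

    Assumption: linear rescaling, so ranks stay equally spaced.
    """
    if n_traders < 2:
        raise ValueError("need at least two traders to rank")
    if not 1 <= rank <= n_traders:
        raise ValueError("rank must lie between 1 and n_traders")
    return 1 + (rank - 1) * (top - 1) / (n_traders - 1)


# In a six-trader market (as in DGHN), ranks 1..6 become 1.0, 2.6, ..., 9.0.
six_trader_scale = [harmonize_rank(r, 6) for r in range(1, 7)]
```

In a nine-trader market the mapping leaves the ranks unchanged, so the harmonized scale coincides with the original 1 to 9 scale used by CLNN.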
The analysis provides evidence using regressions separated by studies and regressions including pooled data from all considered studies. For the latter, it is unavoidable to standardize some variables to ensure comparability. Where necessary, we provide all relevant information about the applied standardization procedure.

III. Results
The results section closely follows CLNN in the order of the observations, as presented in Table 1. For the statistical analysis, we consider generalized least squares (GLS) regressions in line with CLNN. To save space, we only provide the coefficient estimates and refer to the supporting material provided in the dedicated OSF repository for the analysis code and detailed regression tables: osf.io/tpnq4. We end each consideration with a statement on whether we support the originally observed result.

A. Individual Beliefs and Behavior
We begin with observation 1, stating that "optimists" are buyers while "pessimists" are sellers. Therefore, we rank the short-term beliefs (rSTB) and the long-term beliefs (rLTB) to organize subjects into optimists (high rank) and pessimists (low rank). We correlate the ranked beliefs with the period's net purchases, that is, the number of shares purchased minus the number of shares sold (a subject $i$'s net purchases in period $t$ are thus calculated as $NP_{i,t} = S_{i,t} - S_{i,t-1}$, with $S_{i,t}$ denoting her shareholdings at the end of period $t$). A positive correlation between ranked beliefs and net purchases would confirm observation 1.
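The two ingredients of this test, net purchases and ranked beliefs, can be sketched as follows. The sketch follows the verbal definition (shares purchased minus shares sold) and assumes that holdings are recorded at the end of each period, with the initial endowment as the first entry. Ties receive the average rank here, which is a simplification; all function names are hypothetical.

```python
def net_purchases(holdings):
    """Per-period change in shareholdings (purchases minus sales).

    `holdings[0]` is the initial endowment; `holdings[t]` is the position
    at the end of period t (assumed convention).
    """
    return [curr - prev for prev, curr in zip(holdings, holdings[1:])]


def rank_beliefs(beliefs):
    """Rank forecasts within a period: 1 = lowest, n = highest.

    Tied forecasts share the average of the ranks they span.
    """
    order = sorted(range(len(beliefs)), key=lambda i: beliefs[i])
    ranks = [0.0] * len(beliefs)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next forecast equals the current one.
        while end + 1 < len(order) and beliefs[order[end + 1]] == beliefs[order[pos]]:
            end += 1
        shared_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared_rank
        pos = end + 1
    return ranks


# Illustrative (made-up) data: one subject's holdings and one period's forecasts.
changes = net_purchases([4, 6, 3])
ranks = rank_beliefs([30.0, 10.0, 20.0])
```

Pairing each subject's rank with her net purchases in the same period yields the observations that enter the correlation and the regressions.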
Result 1. We confirm CLNN's observation 1. Subjects who believe that prices will be higher tend to be buyers and subjects who believe that prices will be lower tend to be sellers. Consequently, shareholdings are positively correlated with beliefs.
Support. Figure 1 shows the average net purchases for each rank together with the number of observations in each rank. The pattern is similar in all studies. While low-rank subjects tend to sell shares (negative net purchases), the high-rank subjects buy shares (positive net purchases). The result in WH, in particular, shows an almost monotone increase.9 We confirm this graphical relationship using GLS regressions with net purchases (as well as with shareholdings) as the dependent variable and the ranked beliefs as the independent variable. Table 3 reports respective coefficients next to their significance levels. While in rows a, b, and c the results show separate regressions of rSTB, rMTB, and rLTB on net purchases, respectively, row d shows results of a regression with rSTB and rMTB as independent variables (all studies), and row e with rSTB and rLTB as independent variables (only HLN, EF, and DGHN).
For rows a, b, and c, we report significantly positive coefficients in the separate studies as well as in the joint regressions; we differentiate between joint regressions with HLN, EF, and DGHN (columns 1-3), and all studies (columns 1-5).10 As in CLNN, the results hold as well when considering end-of-period shareholdings (rows f, g, and h).11 Hence, we present Result 1.
The second observation states that optimists tend to submit higher bids and asks, and pessimists tend to submit lower bids and asks. CLNN consider i) an individual consistency check for each subject over four market repetitions with 60 observations in total using Spearman correlations, and ii) a consistency test on the market level using GLS regressions. We neglect the individual correlations, as we only have one market per subject in all studies, and instead focus on regression analyses.

Impact of Short-Term Beliefs on Net Purchases
Figure 1 shows net purchases as a function of ranked short-term beliefs. The horizontal axis classifies individuals based on their submitted short-term belief in a given period. One indicates the lowest forecast and six/eight/nine, depending on the number of traders in a market, the highest. The vertical axis indicates the average net purchases in a period as a fraction of total available shares. For mid-ranks, we randomly assigned the lower and higher rank. Omitting the mid-ranks does not change the general pattern. The added labels in parentheses show the number of observations for each rank.

Result 2. We confirm CLNN's observation 2. Subjects who believe that prices will be higher submit higher bids and asks, and subjects who believe that prices will be lower submit lower bids and asks.
Support. Table 4 reports the coefficients for the GLS regressions, analogous to Table 3 but with ranked bids and ranked asks as the dependent variable. CLNN restricted their analysis to bids below and asks above the submitted short-term belief. To make use of the full data set, we do not apply this restriction and thus use the bids and asks as discussed in Section II.12 For all three belief measures (STB, MTB, and LTB), we find a significantly positive relationship between the ranked beliefs and the ranked bids in all regressions (rows a, b, and c). For the ranked asks (rows f, g, and h), we find a significant positive relationship for all models except for HLN. Nevertheless, given the overall support in all other specifications, we are confident in supporting Result 2.
CLNN then also suggest short-term beliefs to be a better determinant of trading behavior than long-term beliefs (observation 3). They support their claim by arguing that in a GLS regression with ranked short-term and ranked long-term beliefs, the latter is not a significant determinant of net purchases, shareholdings, or bids and asks.

Net Purchases and Shareholdings
Table 3 shows GLS regression results of net purchases and shareholdings on ranked short-term beliefs (rSTB), ranked mid-term beliefs (rMTB), and ranked long-term beliefs (rLTB). Based on a Hausman test, we report results of random effects regressions or fixed effects regressions (marked with an "f"). *, **, and *** indicate significance at the 10%, 5%, and 1% levels, respectively. The columns refer to the data set under consideration, whether the separate studies or a merger. Rows a, b, c, f, g, and h show the coefficients for $X_t \in \{\mathrm{rSTB}, \mathrm{rMTB}, \mathrm{rLTB}\}$ related to a regression $Y_t = \alpha + \beta X_t$. Rows d, e, i, and j show coefficients related to a joint regression of $Y_t$ on rSTB together with rMTB or rLTB. This table compares to CLNN's Tables 1 (a, d), 2 (b, c), and 5 (c, f [...]

12 The buy offers and sell offers show some extreme values. We address this problem with three robustness checks: i) we use a trader's median instead of her mean offer in a given period (assuming that subjects submit a mix of both "serious" offers as well as extreme offers without an expectation of being accepted by another trader), ii) we truncate the lowest 10% of buy offers and the highest 10% of sell offers in a study, and iii) we truncate the data set by excluding sell offers higher than five times the market price and buy offers lower than one-fifth of the fundamental value. The results are reported in Tables B1, B2, and B3 in the Appendix and are in line with the results provided in Table 4.
Result 3. We cannot confirm CLNN's observation 3. Short-term beliefs are neither better nor worse determinants of trading behavior than long-term beliefs.

Support. In line with CLNN, we first consider short-term and long-term beliefs in joint regression models (Tables 3 and 4, rows d, e, i, and j). Comparing the coefficients for ranked short- and long-/mid-term beliefs, we find both rMTB and rLTB to be significantly positive in most cases, while rSTB is sometimes but not always significantly positive. Wald tests comparing the coefficients within each joint regression model (i.e., rSTB vs. rMTB and rSTB vs. rLTB) either show no significant difference or show that rLTB and rMTB coefficients are significantly greater than rSTB coefficients. However, as CLNN already pointed out, short-term and long-term beliefs are correlated.13 Hence, multicollinearity potentially weakens the predictive power of our joint regression models, the precision of the estimated coefficients, and the interpretation of the respective p-values.
When comparing the coefficients between the separate regression models for net purchases in Panel A of Table 3 (i.e., between rows a and b, and between rows a and c), we find all coefficients to be highly significant and positive, but those for rMTB and rLTB are generally larger than those for rSTB. When looking at Shareholdings in Panel B of Table 3, we cannot identify a clear pattern. Similarly, in terms of the variation in Net Purchases and Shareholdings that is predictable from rSTB or rMTB/rLTB, looking at $R^2$ does not yield a pattern that would suggest short-term beliefs to be a differently strong predictor of trading behavior.14 Taken together, our considerations yield no support for short-term beliefs to be better determinants of trading behavior than mid-/long-term beliefs and remain somewhat inconclusive. Hence, our analysis cannot support CLNN's observation 3.

Ranked Bids and Asks
Table 4 shows GLS regression results of ranked bids and ranked asks on ranked short-term beliefs (rSTB), ranked mid-term beliefs (rMTB), and ranked long-term beliefs (rLTB). Based on a Hausman test, we report results of random effects regressions or fixed effects regressions (marked with an "f"). *, **, and *** indicate significance at the 10%, 5%, and 1% levels, respectively. The table's setup is analogous to Table 3. This table compares to CLNN's Tables 3 (a, d), 4 (b, c), and 6 (c, f [...]

13 In particular, Spearman rank correlations between rSTB and rMTB are 0.65 for HLN, 0.77 for EF, 0.71 for DGHN, 0.69 for WH, and 0.73 for HPS; and between rSTB and rLTB, they are 0.53 for HLN, 0.56 for EF, and 0.57 for DGHN. Note that short-term beliefs are less strongly correlated with long-term beliefs than with mid-term beliefs.
The fourth observation then considers the updating of expectations: how do subjects adjust their beliefs given new experiences? CLNN state that subjects who are too optimistic adjust their reported beliefs downward, while those who are too pessimistic adjust their reported beliefs upward.
Result 4. We confirm CLNN's observation 4. Individuals increase their price estimates in their belief profiles when their short-term belief has turned out to be below the realized market price and decrease their estimates when their belief has turned out to be above the realized market price.
Support. For this analysis, we consider the four different belief measures, STB, MTB, LTB, and OB. In line with CLNN, we consider subjects to receive an upward price impulse if their short-term belief falls short of the realized price (i.e., $STB_{g,i,t-1} < P_{g,t-1}$) and a downward price impulse if their short-term belief exceeds the realized price (i.e., $STB_{g,i,t-1} > P_{g,t-1}$). We now test whether this impulse can explain the change in beliefs.
In particular, we test whether an upward impulse leads to an increase in short-term beliefs (i.e., $STB_{g,i,t} - STB_{g,i,t-1} > 0$), in mid-term beliefs (i.e., $MTB_{g,i,t} - MTB_{g,i,t-1} > 0$), in long-term beliefs (i.e., $LTB_{g,i,t} - LTB_{g,i,t-1} > 0$), and in overlapping beliefs (i.e., $OB_{g,i,t} - MTB_{g,i,t-1} > 0$), and whether a downward impulse leads to a decrease in the same metrics. Note that overlapping beliefs compare two forecasts made in different periods, but for the same target price; for example, what one forecasts in t for the price to be in t + 1 versus what one forecasts in t + 1 for the price to be in t + 1. The impulses for MTB and LTB compare price forecasts for different periods.15 CLNN considered only long-term beliefs with a different number of periods under consideration.16
The stacked bar charts in Figure 2, each adding up to 100%, depict the percentage of valid observations for subjects who changed their beliefs in line with their price impulse (blue), those who changed their beliefs against their price impulse (red), and those who did not change their beliefs (gray). In all four graphs, we can see that in the majority of forecasts across all studies, subjects adjust their beliefs in line with the impulse. This effect is strongest in HLN and HPS, in which about 75% of forecasts follow the respective impulse.
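The classification of belief adjustments relative to the price impulse can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, and treating forecasts that exactly hit the realized price as carrying no impulse (and therefore as not valid for the shares) is an assumption of the sketch rather than a documented rule of the analysis.

```python
from collections import Counter


def price_impulse(prior_stb, realized_price):
    """Return +1 for an upward impulse, -1 for a downward impulse, 0 for none."""
    if prior_stb < realized_price:
        return 1
    if prior_stb > realized_price:
        return -1
    return 0


def classify_adjustment(prior_stb, realized_price, prior_belief, new_belief):
    """Label a belief change as 'with', 'against', or 'none'; None if no impulse."""
    impulse = price_impulse(prior_stb, realized_price)
    if impulse == 0:
        return None  # assumption: an exact hit gives no impulse to follow
    change = new_belief - prior_belief
    if change == 0:
        return "none"
    return "with" if (change > 0) == (impulse > 0) else "against"


def adjustment_shares(labels):
    """Relative frequency of each label among valid (non-None) observations."""
    valid = [label for label in labels if label is not None]
    if not valid:
        raise ValueError("no valid observations")
    counts = Counter(valid)
    return {label: counts[label] / len(valid) for label in ("with", "against", "none")}


# Hypothetical observations: (prior STB, realized price, prior belief, new belief).
observations = [
    (100, 110, 100, 108),  # upward impulse, belief raised
    (120, 110, 118, 112),  # downward impulse, belief lowered
    (100, 110, 105, 101),  # upward impulse, belief lowered
    (120, 110, 115, 115),  # downward impulse, belief unchanged
    (110, 110, 100, 105),  # forecast hit the price: no impulse
]
shares = adjustment_shares([classify_adjustment(*obs) for obs in observations])
```

The three shares correspond to the blue, red, and gray segments of a single stacked bar; passing a different belief measure as the prior and new belief yields the bars for the other measures.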
In line with CLNN, we count the number of subjects who follow their impulse in the majority of periods. Table 5 reports the respective frequencies for the four belief categories under consideration. For all studies separately, but also when all studies are taken together, more than 70% of the subjects follow the price impulse. Simple binomial tests for each cell show that significantly more than 50% of the subjects follow the price impulse (p < 0.001 in each cell). In line with CLNN's observation 4, we conclude that the majority of individuals increase (decrease) their price estimates when their short-term belief turned out to be below (above) the realized market price.

Impact of Price Impulses
Figure 2 shows the relative frequency of individual adjustments from upward or downward impulses on short-term beliefs (STBs), overlapping beliefs (OBs), mid-term beliefs (MTBs), and long-term beliefs (LTBs; studies HLN, EF, and DGHN only). If the prior STB exceeds (falls short of) the realized price, the subject receives a downward (upward) price impulse. The chart shows how frequently subjects change their belief profile from that of the prior period in the direction of the impulse (blue), versus in the other direction. The numbers of observations for the STBs are 696 for HLN, 4,707 for EF, 1,782 for DGHN, 21,227 for WH, and 5,683 for HPS.

Frequency of Impulse Followers
Table 5 shows the frequency of impulse followers. The numbers represent the relative frequency of subjects who follow their impulse in more than half of all periods for the four belief levels.
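The binomial test on the share of impulse followers can be illustrated with a minimal sketch of an exact one-sided test against a 50% benchmark. Whether the reported tests are one-sided or two-sided is not stated here, so the one-sided form is an assumption, as are the function name and the counts in the example.

```python
from math import comb


def binomial_upper_tail(successes, n, p=0.5):
    """Exact one-sided p-value P(X >= successes) for X ~ Binomial(n, p)."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(successes, n + 1)
    )


# Hypothetical cell: 40 of 53 subjects follow their impulse in most periods.
p_value = binomial_upper_tail(40, 53)
significant = p_value < 0.001
```

With a follower share of roughly 75% among a few dozen subjects, the null hypothesis of a 50% share is already rejected at the 0.1% level in this example.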

B. Beliefs and Earnings
Observation 5 states that subjects earn higher profits when they have accurate beliefs, that is, when they are closer to the target price or expect prices close to the fundamental value.
Result 5. We confirm CLNN's observation 5. Subjects who accurately forecast asset prices and subjects who expect prices close to fundamentals earn higher profits.
Support. To measure deviations from the price and the fundamental value, we apply the RBPD and the RBVD (equation (6) in Table 2), respectively. Both measures are averages over t periods for each subject. A value of 0 indicates that the price forecast hits the target (period price or fundamental value); a higher number means a larger deviation from the target. We apply GLS regressions using the rank of the deviation measure (rRBPD and rRBVD) to predict the ranked end-of-market profit (rPROFIT) of a subject. A negative coefficient indicates that more accurate predictions yield higher end-of-market profits. All coefficients in Table 6 are negative and, apart from row a in HPS, significant. We conclude that observation 5 holds.

C. Belief Dispersion and Market Behavior
Observation 7 claims that belief dispersion, that is, the level of belief heterogeneity, does not affect trading volume but does affect overpricing.
Result 6. We partially confirm CLNN's observation 7. In line with CLNN, we find no robust effect of belief dispersion on transaction volume. Contrary to CLNN, however, we find no association between belief dispersion or initial belief dispersion and overpricing.

Ranked Profits
Table 6 shows GLS regression results of ranked profits on ranked deviations from the price (rRBPD) and fundamental value (rRBVD) based on equations (4) and (5) in Table 2. Based on a Hausman test, we report results of random effects regressions or fixed effects regressions (marked with an "f"). *, **, and *** indicate significance at the 10%, 5%, and 1% levels, respectively. The columns refer to the data set under consideration. Rows a and b report the coefficients for the regressions of type $rPROFIT_t = \alpha + \beta \, rRBPD$ and $rPROFIT_t = \alpha + \beta \, rRBVD$ (with and without controls). This table compares to CLNN's Table 7.
Support. The testable claims from observation 7 consider group-level measurement, relating belief dispersion to transaction volume and overpricing. Table 7 provides the regression results of transaction volume on both belief dispersion and lagged transaction volume in line with CLNN. We find a significant effect of STBD on volume in EF. However, none of the other studies show a meaningful effect, and the effect in the joint regression with all studies is not significant at the 5% level. Moreover, the relationship is not significant for mid-term (MTBD) or long-term belief dispersions (LTBD) in any of the considered studies. Taking these results together, we conclude that there is no relationship between belief dispersion and trading volume.
According to the second claim of CLNN's observation 7, belief dispersion is associated with higher prices, that is, we should expect a positive correlation between STBD, MTBD, or LTBD and RD, indicating overpricing. For example, RD = 0.1 indicates the market is overpriced by 10%, that is, prices are 10% higher than the average fundamental value. As RD is calculated at the market level, we used ordinary least squares (OLS) regressions with only 6 independent observations in HLN, 43 in EF, 35 in DGHN, 145 in WH, and 26 in HPS to test whether belief dispersion affects overpricing. Table 8 reports the coefficients for STBD (a), MTBD (c), and LTBD (e). The results are mixed, as the coefficients are positive in some models and negative in others; across studies, we find no clear pattern. For STBD, we find a strong effect in WH (4), which also translates to the joint regressions in (1-5). However, the belief dispersion in WH is substantially skewed compared to the other studies. We can see this when looking at the percentiles: for example, the quartiles in WH for STBD are 0.042, 0.076, and 0.012, while in EF they are 0.219, 0.300, and 0.417. The discrepancy is even larger for MTBD; here the mean is 694,262 in WH and merely −1.202 in EF. Hence, the results are quite different across studies, and the measures in WH in particular are quite diverse. However, applying outlier criteria across studies is a difficult endeavor and cannot reliably de-bias our estimates in WH.
CLNN consider regressions with six markets over four rounds for a total of 24 (at least partially) dependent observations and suggest a relationship between belief dispersion and overpricing for both the short and the long term. We suspect that the positive correlation is due to the repetition of markets rather than the relationship between overpricing and belief dispersion. Typically, RD becomes smaller over time, as do the belief dispersion values (Haruvy et al. (2007)). The omitted time variable in their analysis might drive this result.
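To illustrate how an RD value is read, the following minimal sketch assumes a definition that is common in the experimental asset market literature: the average difference between period price and fundamental value, scaled by the absolute average fundamental value. The exact formula used for the regressions is not reproduced in this section, so the function and the three-period example are assumptions for illustration only.

```python
def relative_deviation(prices, fundamentals):
    """Mean of (price - fundamental value), scaled by |mean fundamental value|."""
    if len(prices) != len(fundamentals) or not prices:
        raise ValueError("need two non-empty series of equal length")
    n = len(prices)
    mean_fv = sum(fundamentals) / n
    mean_gap = sum(p - fv for p, fv in zip(prices, fundamentals)) / n
    return mean_gap / abs(mean_fv)


# Hypothetical three-period market with a declining fundamental value.
fundamentals = [36.0, 24.0, 12.0]
prices = [39.6, 26.4, 13.2]  # each price lies 10% above its fundamental value
rd = relative_deviation(prices, fundamentals)
```

In this example, rd is approximately 0.1, which corresponds to the 10% overpricing described in the text. Because the measure yields one number per market, the number of independent observations in a regression equals the number of markets.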
Result 7. We partially confirm CLNN's observation 8. Short-term belief dispersion has a significantly positive relationship with price changes but neither mid-term nor long-term belief dispersion has such a significant relationship.
Support. Observation 8 suggests a relationship between belief dispersion and price change. CLNN support their statement using a GLS regression considering the effect of belief dispersion on the absolute change in prices between periods, $\Delta P_t = |P_{g,t} - P_{g,t-1}| / P_{g,t-1}$. We report similar regressions in a corresponding table. Even though we find no relationship for HLN, we do find a significant relationship between STBD and price change in all other studies, as well as when considering all five studies (rows a and d).17 We find no such effect for MTBD or LTBD. Interestingly, our results match those of CLNN except for HLN, which might be because we consider only the first repetition and therefore include only six markets, while CLNN also considered subsequent repetitions.
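The dependent variable of these regressions, the absolute relative price change between consecutive periods, can be computed as in the following minimal sketch. The function name and the price series are hypothetical.

```python
def absolute_price_changes(prices):
    """Return |P_t - P_{t-1}| / P_{t-1} for every period after the first."""
    return [abs(cur - prev) / prev for prev, cur in zip(prices, prices[1:])]


# Hypothetical period prices of one market.
prices = [100.0, 110.0, 99.0]
changes = absolute_price_changes(prices)
```

Note that the measure is undefined for the first period of a market, so a market with T periods contributes at most T - 1 observations.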
In the second part of observation 8, CLNN find that the effect of price changes on dispersion is stronger than the effect of dispersion on price changes. They reach this conclusion by comparing significance levels between regressions (e.g., comparing (a) STBD with (d) L_ΔPRICE). When applying the same procedure to the five studies under consideration, we do not come to a similar conclusion. However, a proper test of this claim would require a different statistical toolset, which we deem infeasible given the structure of the data at hand (unbalanced, partially nonstationary, biased measurements of dispersion, partially missing observations); we follow CLNN in disregarding such tests.
... for this observation suggestive rather than providing clear evidence. In line with Fellner and Theissen (2014), we found no relationship between belief dispersion and overpricing for all studies taken together. This result is quite relevant for experimental research that aims to understand bubble formation. If this observation holds, what other factors foster bubble formation? Moreover, our results are in contrast to empirical results using proxies for heterogeneity in beliefs. For example, Doukas et al. (2006) showed such a relationship using a diversity measure that considered analysts' heterogeneous expectations as a proxy for divergence in opinion.
Transparency, openness, and reproducibility are key features of research integrity that strengthen the validity of scientific results (Nosek, Alter, Banks, Borsboom, Bowman, Breckler, Buck, Chambers, Chin, Christensen, Contestabile, Dafoe, Eich, Freese, Glennerster, Goroff, Green, Hesse, Humphreys, Ishiyama, Karlan, Kraut, Lupia, Mabry, Madon, Malhotra, Mayo-Wilson, McNutt, Miguel, Paluck, Simonsohn, Soderberg, Spellman, Turitto, VandenBos, Vazire, Wagenmakers, Wilson, and Yarkoni (2015)). With respect to the seven primary studies in this reconsideration, we note that, unfortunately, not all data and procedures were easily accessible without consulting the original authors. The same is true for the exact analytic methods applied by Carlé et al. (2019). While we have seen some movement in this direction in recent years, many journals in economics and finance remain lenient.18 This reinforces previous calls for journals to promote principles of open science (Nosek et al. (2015), Christensen and Miguel (2018)), and we thus aim to reiterate the importance of data and analysis transparency for reproducibility and replicability, to maintain the quality of experimental studies, and to restore trust in scientific results.

A. Data Set and Experimental Design Characteristics
We provide an overview of the design elements in Table A1. HLN considered the Smith et al. (1988) design. Nine traders traded 18 shares during a sequence of 15 call market trading periods. At the end of every period, each share paid a dividend of 0, 4, 14, or 30 francs with equal probability. The fundamental value in each period was just the sum of expected dividend payments (i.e., 180 in period 1, 168 in period 2, …, and 12 in period 15). Before each period, the subjects had to forecast the clearing price for each of the remaining periods. For example, in period 10, every subject submitted six price forecasts: one each for periods 10, 11, 12, 13, 14, and 15. The subjects received payment for accuracy, that is, the distance between the forecasted prices and the realized clearing price. HLN had six cohorts with nine traders each who participated in four repeated markets.
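The fundamental value path and the number of forecasts per period in the HLN design follow directly from the parameters above, as the following minimal sketch shows. The function names are hypothetical; the dividend values and the number of periods are those stated in the text.

```python
from fractions import Fraction


def fundamental_values(dividends, n_periods):
    """Sum of expected remaining dividends at the start of each period."""
    expected = Fraction(sum(dividends), len(dividends))  # equal probabilities
    return [expected * (n_periods - t + 1) for t in range(1, n_periods + 1)]


def forecasts_required(period, n_periods):
    """Number of price forecasts submitted before a period (one per remaining period)."""
    return n_periods - period + 1


fv = fundamental_values([0, 4, 14, 30], 15)
n_forecasts = forecasts_required(10, 15)
```

The expected dividend is 12 francs per period, so the fundamental value declines from 180 in period 1 in steps of 12 down to 12 in period 15, and a subject submits six forecasts before period 10.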
DGHN considered a similar design with different parameters.They had 6 traders per market, 10 periods, and 2 repetitions, as well as different endowments and dividend payments.In addition, their treatments partially permitted borrowing and short selling.In total, they had 35 cohorts with 210 student subjects.
EF considered a design similar to HLN with doubled cash endowments and dividend payments. Most importantly, the trading facility was a continuous double auction instead of a call market. Hence, the subjects had to forecast the average period price and not the clearing price. In total, EF ran 43 markets without repetition.

Experimental Design Comparison
Table A1 shows the experimental parameters of the primary studies under consideration. We use the abbreviations HLN (Haruvy et al. (2007)), EF (Eckel and Füllbrunn (2015)), HPS (Holt et al. (2017)), DGHN (Duchêne et al. (2019)), and WH (Huber et al. (2019), Weitzel et al. (2020)). Note that we combine data from Eckel and Füllbrunn (2015) and Eckel and Füllbrunn (2017), as well as data from Huber et al. (2019) and Weitzel et al. (2020), into one study each (EF and WH), as those publications resulted from the same respective study setup.
a We consider the first market only. HLN: 5 markets with 9 traders, 1 market with 8 traders; WH: 19 markets with 7 traders, 121 markets with 8 traders, 5 markets with 9 traders; 244 short-term beliefs are missing.
b HLN and EF: 1/3 of traders is endowed with one of the three pairs each; WH: endowment depends on the treatment but is identical across traders in a market.
c HLN, EF, and DGHN elicit long-term beliefs for all remaining periods; HPS and WH only elicit beliefs for up to two periods ahead.
d HLN and HPS do not specify the time of their experimental sessions.

TABLE 1
Observations Reported in Carlé et al.
Table 1 reproduces the nine observations reported in

TABLE 6
Table 8 reports the coefficients for STBD (a),

TABLE 7 Volume
Table 7 shows the GLS regressions of transaction volume ($VOL_{g,t}$) on short-term and mid-term belief dispersions ($STBD_{g,t}$ and $MTBD_{g,t}$), controlling for lagged transaction volume (L.VOLUME, $VOL_{g,t-1}$) in rows a and b. Row c considers long-term belief dispersion ($LTBD_{g,t}$), which is only available for studies HLN, EF, and DGHN. Based on a Hausman test, we report results of random effects regressions or fixed effects regressions (marked with an "f"). *, **, and *** indicate significance at the 10%, 5%, and 1% levels, respectively. This table compares to CLNN's Table 8.

TABLE 8 Relative Deviation
Table 8 reports coefficients for OLS regressions of relative deviation (RD) on short-term belief dispersion (STBD), mid-term belief dispersion (MTBD), and long-term belief dispersion (LTBD). Rows a, c, and e consider the market average, while rows b, d, and f consider the first-period belief dispersions only.