An experiment on a multi-period beauty contest game

Nobuyuki Hanaki; Yuta Takahashi

doi:10.1017/eec.2026.10048


Published online by Cambridge University Press: 15 May 2026


Nobuyuki Hanaki*: Affiliation:
Institute of Social and Economic Research, the University of Osaka, and University of Limassol, 6-1 Mihogaoka, Ibaraki, Osaka, Japan University of Limassol, Cyprus
Yuta Takahashi: Affiliation:
Institute of Social and Economic Research, the University of Osaka, and University of Limassol, 6-1 Mihogaoka, Ibaraki, Osaka, Japan
*: Corresponding author: Nobuyuki Hanaki; Email: nobuyuki.hanaki@iser.osaka-u.ac.jp


Abstract

We present and conduct a novel experiment on a multi-period beauty contest game. Leveraging the multi-period feature, we investigate how participants revise their forecasts in periods when new information (such as shocks or announcements) arrives and how they form their expectations in the absence of new information. We make two key contributions. First, we develop a new method based on forecast revisions to evaluate whether participants behave in a forward-looking manner. The experimental results show that participants do react to anticipated shocks, namely, the announcements of future shocks. Second, we identify a new strategic environment effect during periods without new information: only when the game exhibits strategic complementarity do participants use extrapolation and expect continuously rising prices. This finding suggests that expectation formation is endogenous to the economic environment, and policy design should thus take this endogeneity into account.

Keywords

Expectation formation; forward-lookingness; learning-to-forecast experiment; strategic environment effect; JEL codes: C92, D84, E70

Information

Type: Original Paper
Information: Experimental Economics, First View, pp. 1-26

DOI: https://doi.org/10.1017/eec.2026.10048
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike licence (http://creativecommons.org/licenses/by-nc-sa/4.0), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the same Creative Commons licence is used to distribute the re-used or adapted article and the original article is properly cited. The written permission of Cambridge University Press or the rights holder(s) must be obtained prior to any commercial use.
Copyright: © The Author(s), 2026. Published by Cambridge University Press on behalf of the Economic Science Association.

1. Introduction

Rational expectations have long been a cornerstone of macroeconomic modeling. Over the last five decades, a substantial body of macroeconomic and experimental economics literature has evaluated the rational expectations equilibrium (REE) hypothesis. One notable finding from the macroeconomics literature is that models with the strong form of rational expectations can lead to erratic implications, such as the forward guidance puzzle. Consequently, researchers have become increasingly skeptical of this assumption and have proposed alternative frameworks (e.g., Angeletos & Lian, 2018; Farhi & Werning, 2019; García-Schmidt & Woodford, 2019).

Experimental evidence from “learning-to-forecast” experiments further calls into question the validity of the REE hypothesis. These experiments, pioneered by Marimon and Sunder (1993), reveal that participants’ behaviors often diverge from the rational expectations benchmark, especially when the aggregate outcome is determined endogenously by individual forecasts in the presence of positive expectational feedback between the aggregate outcome and individual forecasts (Bao et al., 2017; Heemeijer et al., 2009; Hommes et al., 2005).¹ The presence of expectational feedback (where individual expectations influence the aggregate outcome, which in turn shapes future expectations) can lead to dynamics that are better captured by models incorporating backward-looking expectation formation (Anufriev & Hommes, 2012; Evans et al., 2025).²

However, most existing learning-to-forecast experiments are essentially of the single-period type.³ Participants only need to forecast the outcome in the current period or in one future period (e.g., tomorrow), and the realized outcome is a function of submitted forecasts regarding this single period (see Bao et al., 2021, for a survey).⁴ As a result, it is difficult for these experiments to provide useful insights for macroeconomists who study expectation formation in a multi-period environment and policymakers who need to know how agents form expectations to design better policies in a dynamic world. In typical macroeconomic models and dynamic games more generally, individuals forecast not only the actions of others in one period but also their actions in multiple future periods, and these forecasts about multiple future periods determine current outcomes. The single-period framework does not enable us to analyze how these series of expectations change in such a multi-period environment, such as in response to news about future shocks.

To fill this gap in the literature, we propose and conduct a novel experiment of a multi-period beauty contest game where participants submit a sequence of forecasts for multiple future periods. Based on this new experimental framework, we examine how individuals revise their expectations in response to new information, as well as how they form forecasts during normal periods without such information. We aim to contribute to the literature by providing this new dynamic experimental framework, offering a new method to assess the forward-lookingness of expectations, and identifying a new strategic environmental effect.

Our experiment is motivated by the core component of the canonical New-Keynesian model (e.g., Woodford, 2003, and Galí, 2015; initially developed by Calvo, 1983). In our theoretical model, there are many firms, and each firm faces the same linear demand curve. The demand for a given firm depends on its own price and the aggregate price level. While an increase in the firm’s own price reduces the demand for that firm, demand may increase or decrease when other firms raise their prices. If firms can set their prices freely, they choose prices to maximize profit, taking other firms’ prices as given. An REE price is a fixed point in this process and is often referred to as the optimal flexible price.

Following Calvo (1983), we introduce a friction that limits firms’ ability to choose the optimal flexible price. Instead, firms can only change their prices with probability $1-\theta$. This constraint induces them to take future prices into account because future profit depends on current price choices; if prices cannot be adjusted later, firms must compete in future periods with the price selected today. Hence, they must incorporate future market conditions and price expectations into their current decisions. This mechanism introduces an additional dynamic expectational feedback effect, since beliefs about future prices influence today’s actions.

We transform our theoretical model into a multi-period version of the learning-to-forecast game in Marimon and Sunder (1993). In each period, participants submit forecasts for prices in multiple future periods all at once. These forecasts determine not only participants’ payoffs but also the realized aggregate prices. We set up our game so that participants merely need to anticipate aggregate prices accurately and their payoffs are maximized when they submit accurate forecasts in each period, regardless of whether the game features strategic substitutes or complements. We formally show the equivalence between this transformed experiment and the original price-setting game in Appendix A.

We adopt a three-by-two between-subjects experimental design that varies two key dimensions. The first dimension pertains to the structure of strategic interaction among participants, contrasting conditions of strong strategic complementarity with those involving equally strong or greater degrees of substitution. The second dimension concerns the timing of information about upward shifts in the demand curve, which we call shocks. In some treatments, participants face shocks that occur unexpectedly. Such shocks are referred to as unanticipated shocks. In others, they are announced two periods in advance. Such announcements of future shocks are termed anticipated shocks. This experimental design allows us to investigate both how the strategic environment affects behavior and whether participants form forward-looking expectations in response to anticipated shocks.

We begin by examining the strategic environment effect, which posits that prices deviate substantially from REE prices (that is, prices expected under REE) when the game features strategic complements, whereas prices converge quickly to REE prices otherwise. Re-examination of this well-established strategic environment effect in the literature serves as a sanity check to ensure that our experimental framework produces results that are aligned with those of existing one-period learning-to-forecast games (see, e.g., Bao et al., 2012; Bao et al., 2017; Bao & Hommes, 2019; Heemeijer et al., 2009).⁵ We find that the strategic environment effect is replicated in our new experimental framework, thereby validating the internal consistency of our setup before we move on to more substantive analyses.

Having reproduced the strategic environment effect in our experimental framework, we proceed to investigate the forward-looking nature of expectation formation. In particular, we examine how participants respond to anticipated shocks; namely, the announcement of future shocks. Recall that the key innovation of our design is that participants submit a sequence of forecasts extending into future periods, including those after an announced shock has occurred. For example, when a shock scheduled two periods ahead is announced in advance, we can analyze how participants revise their forecasts for periods after the realization of the shock following the announcement by comparing the forecasts for the same future periods before and after the announcement. This feature highlights the novelty of our approach; rather than relying solely on forecasts made for the immediate next period, we explicitly capture adjustments that unfold after anticipated shocks have materialized.

Our experimental data reveal significant heterogeneity in how participants respond to the anticipated shocks. While certain individuals forecast price levels that surpass those implied by the REE, others completely ignore anticipated shocks. The latter response is consistent with level-0 reasoning. This result is similar to the one based on the responses to the unanticipated shocks reported by Evans et al. (2025) in their single-period setting.

Although the heterogeneity in behavior is noteworthy in its own right, our experimental framework permits us to direct attention to the average responses of participants and to evaluate whether these are consistent with forward-looking behavior.⁶ Our analysis demonstrates that, on average, participants do respond to the anticipated shocks in line with forward-looking behavior. This finding provides new evidence of forward-lookingness in expectation formation, offering a distinctive contribution to the literature.

After examining participants’ responses to anticipated shocks, we turn to the formation of expectations during periods that lack such shocks. That is, we focus on “normal” times to study belief formation in a dynamic environment. Following Anufriev and Hommes (2012), we estimate a reduced-form forecasting rule that encompasses various heuristics like adaptive expectations and trend following. We generalize their approach by allowing coefficients of these heuristics to vary across different forecast horizons.

The regression results show that most participants rely heavily on the most recent price as a reference point, with coefficients close to one, and they also consider their past forecast errors, indicating self-referential expectations. Notably, the impact of recent price changes becomes stronger as the forecast horizon lengthens. This effect is particularly strong in positive feedback environments, where participants increasingly rely on trend-following behavior. This finding suggests that in positive feedback environments (i.e., those with strategic complementarity), participants expect price increases to persist, potentially leading to de-anchored long-run inflation expectations. Conversely, in negative feedback settings (i.e., those with strategic substitutability), participants view price increases as temporary, and their expectations remain relatively stable across horizons.

Taken together, these findings indicate an important policy implication: expectation formation depends qualitatively on whether the strategic environment features complementarity or substitutability. As a result, central banks must understand the strategic context in which agents operate to effectively conduct monetary policy and mitigate the risks of expectation de-anchoring.

The rest of the paper is organized as follows. Section 2 presents a model of a multi-period beauty contest based on Calvo (1983). The experiment’s design and procedure are presented in Section 3, while Section 4 shows the REE as the benchmark. The results of the experiment are summarized in Section 5, including a discussion of implications for modeling expectation formation. Section 6 concludes.

2. A model of a multi-period beauty contest

We introduce our theoretical model and derive the equilibrium price under the REE of the model. The theoretical model serves as the foundation for our experiment design, and the REE price offers a benchmark for comparison in the subsequent analysis.

Consider a continuum of monopolistically competitive firms uniformly distributed over the interval $\left[0,1\right]$. Each firm $i$ selects a price $p_{i}$; for simplicity, we omit the subscript $i$. The demand function for an individual firm is

(1)

\begin{equation} D\left(p;P\right)\equiv\left[a-bp+cP\right]^{+}, \end{equation}

where $P$ represents the aggregate price given by

(2)

\begin{equation} P=\int_{0}^{1}p_{i}di. \end{equation}

We assume that $a > 0$, $b > 0$, and $c\in\mathbb{R}$, with the condition

\begin{equation*} -\infty < c\leq2b, \end{equation*}

which ensures the existence of an REE price in the single-period version of the game. The parameter $c$, which can be positive or negative, governs the degree of strategic interaction among firms. All firms share an identical linear technology with a unit production cost denoted by $\kappa$.

Each firm maximizes the present value of its profit but faces pricing frictions; firms cannot adjust their prices every period and can only do so with probability $1 - \theta$. If a firm cannot change its price for $T$ consecutive periods, it is allowed to adjust its price with certainty in the next period.⁷ Additionally, with probability $\gamma \in [0,1)$, all firms may be forced to exit the market. This assumption enables us to conduct experiments within a finite period that effectively replicate those of an infinite period (Duffy, 2017).

Firms that can reset their prices in period $t$ solve the following optimization problem:

(3)

\begin{equation} \max_{p_{t}} \sum_{s=0}^{T-1}((1-\gamma)\theta)^{s}\left(p_{t}-\kappa \right)D\left(p_{t};P_{t+s}\right). \end{equation}

Recall that in every period, firm $i$ can re-optimize its price with probability $1-\theta$. Since the continuation payoffs in these future events are not a function of $p_{t},$ they do not appear in objective function (3).

We now formally define an REE as a pair of prices $(p_t,P_t)$ such that, given the aggregate price $(P_t)_t$, $p_t$ solves the maximization problem in Eq. (3), and consistency, $P_t = p_t$, holds for all $t$.

We establish the following proposition:

Proposition 1. The optimal price for firms that can reset their prices at period $t$ is given by

(4)

\begin{equation} p_{t}=\sum_{s=0}^{T-1}\frac{((1-\gamma)\theta)^{s}}{\sum_{k=0}^{T-1}((1-\gamma)\theta)^{k}}\left(\alpha+\beta P_{t+s}\right), \end{equation}

where

\begin{equation*} \alpha=\frac{1}{2}\left(\kappa+\frac{a}{b}\right),\quad\beta=\frac{1}{2}\frac{c}{b}. \end{equation*}

Moreover, the aggregate REE price evolves according to

(5)

\begin{equation} P_{t} =\left(1-\theta\right)p_{t}+\theta P_{t-1}. \end{equation}

Proof. Taking the first-order conditions of the maximization problem in Eq. (3), we obtain Eq. (4). To derive Eq. (5), recall that the aggregate price in Eq. (2) is given by the average of individual prices and that only the fraction $1-\theta$ of firms can reset their prices. Because firms that cannot reset their prices today are randomly chosen, the average price among them is $P_{t-1}$. Firms that can reset their prices choose the same price level given by Eq. (4), so the average price $P_t$ satisfies Eq. (5).

Notably, $\alpha+\beta P_{t}$ represents the optimal price in period $t$ if firms can reset their prices freely. This price is often referred to as the optimal flexible price. Therefore, the optimal price in our environment, as shown in Eq. (4), constitutes a weighted average of optimal flexible prices. Eq. (4) captures the dynamic thought process within firms; the optimal price, $p_{t}$, depends on current actions by others, and on future actions, represented by $\left(P_{t+s}\right)_{s=0}^{T-1}$. Because of this inter-period interdependence, today’s aggregate price is also influenced by expectations of future prices. This feature leads to the model being referred to as a dynamic beauty contest model (e.g., Angeletos & Lian, 2018).

By contrast, when firms can adjust prices without friction ($\theta = 0$), the dynamic aspect disappears, and firms focus solely on predicting other firms’ current actions. Our proposed experiment, inspired by this model, differs from existing single-period experiments by explicitly requiring participants to anticipate others’ actions not only in period $t$ but also in subsequent periods $t+1, \ldots, t+T-1$.
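As a minimal sketch of Proposition 1, the Python snippet below computes the horizon weights in Eq. (4), the resulting reset price, and the aggregate update in Eq. (5). The function names and the numerical values ($\alpha=6.5$, $\beta=0.9$, and a constant expected price of 65) are illustrative assumptions rather than part of the model description.

```python
def horizon_weights(theta, gamma, T):
    """Normalized weights ((1-gamma)*theta)^s / sum_k ((1-gamma)*theta)^k, s = 0..T-1."""
    d = (1.0 - gamma) * theta
    raw = [d ** s for s in range(T)]  # note: 0.0 ** 0 evaluates to 1.0 in Python
    total = sum(raw)
    return [w / total for w in raw]


def reset_price(alpha, beta, expected_prices, theta, gamma):
    """Eq. (4): weighted average of flexible prices alpha + beta * P_{t+s}."""
    weights = horizon_weights(theta, gamma, len(expected_prices))
    return sum(w * (alpha + beta * P) for w, P in zip(weights, expected_prices))


def aggregate_price(p_reset, P_prev, theta):
    """Eq. (5): a share 1 - theta resets to p_reset, the rest keep P_prev on average."""
    return (1.0 - theta) * p_reset + theta * P_prev


# Illustrative (assumed) values: alpha = 6.5 and beta = 0.9 make 65 a fixed point.
expected = [65.0] * 5
p = reset_price(6.5, 0.9, expected, theta=0.5, gamma=0.05)
P = aggregate_price(p, P_prev=65.0, theta=0.5)
print(p, P)
print(horizon_weights(0.0, 0.05, 5))  # frictionless case: all weight on s = 0
```

With $\theta=0$ the weights collapse onto the current period, which is the single-period case described in the text.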

3. Experimental design

In this section, we describe the experimental setup by detailing the structure of the learning-to-forecast game that participants actually play. We then outline our treatment design, followed by an explanation of the online experimental procedure.

3.1. Setup

In our experimental setup, we employ groups of six participants.⁸ These participants, indexed by $i$, engage in the game over multiple periods. At the beginning of each period, they submit their price forecasts for the next five periods including the current one; that is, $T=5$.⁹ For example, in the first period, participants submit their price forecasts for periods 1 to 5. In the second period, they provide forecasts for periods 2 to 6, and so on. We denote the forecast of the price in period $k$ submitted in period $t$ as $f_{t,k}^{i}$. While not all submitted forecasts impact participants’ rewards, as detailed below, all five forecasts have the potential to do so when submitted.¹⁰

Let $\pi_{t}^{i}$ denote the reward of participant $i$ in period $t$, as determined by

\begin{equation*} \pi_{t}^{i}=\frac{100}{\left|F_{t}^{i}-P_{t}\right|+1}, \end{equation*}

where $P_{t}$ is the realized price in period $t$, and $F_{t}^{i}$ is the payoff-relevant forecast of participant $i$ in period $t$, as defined below.¹¹ If the reward is not an integer, then it is rounded to the nearest integer.

In the first period ($t=1$), the payoff-relevant forecast is set to $F_{1}^{i}=f_{1,1}^{i}$. In period $2$, the payoff-relevant forecast $F_2^i$ is determined probabilistically: with probability $1-\theta$, the newly submitted set of forecasts becomes payoff-relevant, yielding $F_{2}^{i}=f_{2,2}^{i}$; with probability $\theta$, the previous forecasts remain payoff-relevant, resulting in $F_{2}^{i}=f_{1,2}^{i}$. This process continues in subsequent periods. If the same set of forecasts has been payoff-relevant for five consecutive periods (i.e., a firm cannot reset its price for five consecutive periods), then the new set of forecasts submitted in the next period becomes payoff-relevant with certainty.

This adjustment process is motivated by the multi-period beauty contest model introduced in Section 2. The probability of firms being given an opportunity to re-optimize their prices in period $t$, $1-\theta$, is translated into the probability of the new set of forecasts becoming payoff-relevant for participants. The horizon, $T$, over which firms optimize in the model is equivalent to the number of future periods in addition to the current one over which our participants forecast in each period. As in Marimon and Sunder (1993) and Bao et al. (2017), we set the payoffs so that the equilibrium path of the Nash equilibrium (subgame perfect Nash equilibrium) corresponds to the REE of the multi-period beauty contest game in Section 2. See Appendix A for a formal proof.

To illustrate how submitted forecasts determine the payoff-relevant forecasts, consider the hypothetical forecasts of participant $i$ shown in Table 1. Each row lists the forecasts submitted in period $t$. For example, in period $t=1,$ the participant submits forecasts of the prices for periods $1$ to $5$:

\begin{equation*} \left(f_{1,1}^{i},f_{1,2}^{i},f_{1,3}^{i},f_{1,4}^{i},f_{1,5}^{i}\right)=\left(10,11,12,12,12\right). \end{equation*}

Table 1. Hypothetical submitted forecasts

In period 1, this set of forecasts becomes payoff-relevant so that the payoff-relevant forecast is $F_{1}^i=10$. In period $2$, the participant submits new forecasts $f^i_{2,2},\ldots, f^i_{2,6}$. Suppose that the new forecasts do not become payoff-relevant in period 2. Then, $F_2^i = f_{1,2}^i =11$. If the new set of forecasts becomes payoff-relevant in periods 3 and 4, then the latest forecasts determine the payoff-relevant forecasts. Thus, $F_{3}^{i}= f_{3,3}^i= 9$ and $F_{4}^{i} = f_{4,4}^i = 13$.¹² The reward for the participant is determined by the difference between the payoff-relevant forecasts $F_t^i$ and the aggregate price $P_t$.
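The selection rule and the reward formula can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the tie-breaking rule for rounding (half up) is an assumption, and the forecast for period 2 submitted in period 2 is a placeholder, since only $f_{1,\cdot}^{i}$, $f_{3,3}^{i}$, and $f_{4,4}^{i}$ are quoted in the text.

```python
import math


def reward(F, P):
    """Reward 100 / (|F - P| + 1), rounded to the nearest integer.

    Ties are rounded half up here; the exact tie-breaking rule is an assumption.
    """
    return math.floor(100.0 / (abs(F - P) + 1.0) + 0.5)


def payoff_relevant(submitted, reset_draws, T=5):
    """Map each period t to its payoff-relevant forecast F_t.

    submitted[t][k] is the forecast for period k submitted in period t.
    reset_draws[t] is True when the period-t submission becomes payoff-relevant.
    The first submission is always used, and a set that has been payoff-relevant
    for T consecutive periods is replaced with certainty.
    """
    base = None  # period in which the currently relevant set was submitted
    F = {}
    for t in sorted(submitted):
        forced = base is None or t - base >= T
        if forced or reset_draws.get(t, False):
            base = t
        F[t] = submitted[base][t]
    return F


submitted = {
    1: {1: 10, 2: 11, 3: 12, 4: 12, 5: 12},  # from the text
    2: {2: 12},                              # placeholder, not used below
    3: {3: 9},                               # f_{3,3} from the text
    4: {4: 13},                              # f_{4,4} from the text
}
draws = {2: False, 3: True, 4: True}
print(payoff_relevant(submitted, draws))  # F_1..F_4 as in the worked example
print(reward(11, 10))                     # forecast off by one point
```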

We now explain how the aggregate price is determined in our experiments; all participants submit their forecasts simultaneously every period, and these submitted forecasts jointly determine the aggregate price $P_{t}$. Based on the model in Section 2, the aggregate price is given by

(6)

\begin{equation} P_{t}=\frac{1}{6}\left(\sum_{i\,\text{cannot reset}} p^i_{t-1} + \sum_{i\,\text{can reset}}\sum_{j=0}^{T-1}\frac{((1-\gamma) \theta)^{j}}{\sum_{l=0}^{T-1}((1-\gamma) \theta)^{l}}\left(\alpha+\beta f_{t,t+j}^{i}\right) \right), \end{equation}

where $\alpha$ and $\beta$ are parameters of the model that are specified below. Eq. (6) is the empirical counterpart of Eq. (2). The first term on the right-hand side of Eq. (6) represents the prices of participants who are unable to adjust their prices. The second term corresponds to the prices of participants who can reset their prices. The mapping of submitted forecasts to the optimal price is given by Eq. (4).
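A minimal sketch of Eq. (6) for one period is given below. The function names, the group's previous prices, and the forecasts are hypothetical; $\alpha=6.5$ and $\beta=0.9$ are assumed values chosen only so that 65 is a fixed point.

```python
def price_from_forecasts(forecasts, alpha, beta, theta, gamma):
    """Map one participant's forecasts (f_{t,t}, ..., f_{t,t+T-1}) to a price via Eq. (4)."""
    d = (1.0 - gamma) * theta
    raw = [d ** j for j in range(len(forecasts))]
    return sum(r * (alpha + beta * f) for r, f in zip(raw, forecasts)) / sum(raw)


def realized_price(prev_prices, forecasts, can_reset, alpha, beta, theta=0.5, gamma=0.05):
    """Eq. (6): average of carried-over prices and prices implied by new forecasts."""
    prices = [
        price_from_forecasts(f, alpha, beta, theta, gamma) if reset else p_prev
        for p_prev, f, reset in zip(prev_prices, forecasts, can_reset)
    ]
    return sum(prices) / len(prices), prices


# Hypothetical group of six: everyone carried a price of 65 into the period,
# and the three participants who can reset forecast 70 for all five periods.
prev = [65.0] * 6
fc = [[70.0] * 5] * 6
reset = [True, True, True, False, False, False]
P_t, individual = realized_price(prev, fc, reset, alpha=6.5, beta=0.9)
print(P_t, individual)
```

In this example the resetting participants' implied price is $6.5 + 0.9 \times 70 = 69.5$, so the realized price is the average of three prices of 69.5 and three of 65.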

On the screen in which participants submit their forecasts, the values of $\alpha$ and $\beta$ are presented clearly. On the same screen, participants are informed of the realized $P_t$ and the payoff-relevant forecast $F_t$ in all past periods. See Appendix G for the screenshots.

In our experiment, we set $\theta = 1/2$, so that a newly submitted set of forecasts becomes payoff-relevant with probability $1-\theta = 1/2$, and assume that the game ends with probability $\gamma=0.05$ at the end of each period.¹³ Participants are rewarded based on the total points they earn throughout the game. As noted above, we set $T=5$, a choice that Eq. (6) helps to justify. When $\theta = 1/2$ and $\gamma=0.05$, the impact of $f^i_{t,t+4}$ on the optimal price is minimal because its weight, $((1-\gamma)\theta)^4/\sum_{k=0}^4((1-\gamma)\theta)^k$, is approximately 0.027. Therefore, allowing participants to make longer forecasts is unlikely to change the results.
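For reference, with $(1-\gamma)\theta = 0.95 \times 0.5 = 0.475$, the arithmetic behind this weight can be written out as follows (values rounded to four decimals):

\begin{equation*} \frac{0.475^{4}}{\sum_{k=0}^{4}0.475^{k}}=\frac{0.0509}{1+0.475+0.2256+0.1072+0.0509}=\frac{0.0509}{1.8587}\approx 0.027. \end{equation*}

By the same calculation, the weights on the forecasts for periods $t$ through $t+3$ are roughly 0.538, 0.256, 0.121, and 0.058.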

While it is conceptually trivial to end a game stochastically, implementing such a probabilistic termination rule in a laboratory experiment poses challenges, either because the game might end too soon to study the evolution of the forecasts in response to shocks, or because the game might not end within the scheduled time for participants. To address these challenges, we use the block random termination method (Fréchette & Yuksel, 2017) commonly used in experiments involving indefinitely repeated games.

Under this method, participants play the game in blocks of $B$ periods. During each block, the game proceeds without participants knowing whether or not it has ended. Only at the end of a block are participants informed if the game actually ended at some point during that block.¹⁴ If the game has ended during the block, they receive the sum of their payoffs $\pi_{t}^{i}$ up to the period when the game ended. For example, if the game actually ended in period $\tau$ where $\tau < B$, participants are rewarded based on $\sum_{t=1}^{\tau}\pi_{t}^{i}$. If the game has not ended, it continues into the next block of $B$ periods. Participants are also informed that the game can continue beyond $B$ periods; if that happens, they play the game for at least another $B$ periods. In our experiment, we set $B = 20$, so participants are told they will play the game for at least 20 periods.
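The logic of block random termination can be sketched as follows. This is a generic illustration of the method under the stated $\gamma$ and $B$, not the authors' implementation (the experiment itself used a predefined random sequence); the function names and the flat reward stream are hypothetical.

```python
import random


def play_with_block_termination(gamma, block_length, rng):
    """Return (tau, periods_played).

    tau is the period in which the game actually ends; participants keep
    playing until the end of the block that contains tau.
    """
    tau = None
    period = 0
    while True:
        for _ in range(block_length):
            period += 1
            # One termination draw at the end of each period, until the game ends.
            if tau is None and rng.random() < gamma:
                tau = period
        if tau is not None:
            return tau, period


def total_reward(period_rewards, tau):
    """Sum of per-period rewards up to and including the termination period tau."""
    return sum(period_rewards[:tau])


rng = random.Random(2026)  # arbitrary seed, for reproducibility only
tau, played = play_with_block_termination(gamma=0.05, block_length=20, rng=rng)
print(tau, played, total_reward([50] * played, tau))
```

Participants always play a whole number of blocks, but only the periods up to $\tau$ count toward their payment.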

To minimize variation in experiment duration and participant payments across sessions, we followed the procedure of Duffy and Puzzello (2014, 2022) by predefining the random number sequence. Predefining the sequence ensured that all sessions consisted of two blocks and concluded after the first game. Consistent with Duffy and Puzzello (2014, 2022), this information was not disclosed to participants.¹⁵

During the games, we introduce (a maximum of) two shocks to $\alpha$, which affects the demand size, during the first two blocks of 20 periods (one shock in each block). The literature suggests that under strategic substitution, where prices often converge to the steady state level, it may take several periods to do so. To allow for the prices to stabilize before introducing a shock, we introduced the first shock at the beginning of period 14 and the second at period 29. We assume that the initial flexible REE price is 65; it becomes 85 and 110 after the first and the second shocks, respectively. We choose $\alpha$ to match these price levels for each value of $\beta.$
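As a hedged illustration of this calibration, suppose the flexible REE price is the fixed point of $P=\alpha+\beta P$, so that $\alpha=(1-\beta)P$. Under that assumption, the snippet below backs out $\alpha$ for each value of $\beta$ used in the treatments and each target price; the resulting numbers follow from this assumption and are not reported in the text.

```python
TARGET_PRICES = (65, 85, 110)  # flexible REE prices before and after the two shocks
BETAS = (0.9, -0.9, -1.8)      # treatment values of beta


def alpha_for_target(price, beta):
    """Solve the fixed point P = alpha + beta * P for alpha (assumed steady-state condition)."""
    return (1.0 - beta) * price


calibration = {
    beta: [round(alpha_for_target(P, beta), 6) for P in TARGET_PRICES]
    for beta in BETAS
}
for beta, alphas in calibration.items():
    print(beta, alphas)
```

Under this assumption, the required $\alpha$ is much larger in the substitution treatments, because a negative $\beta$ pulls the fixed point down for any given $\alpha$.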

3.2. Treatments

Our three-by-two between-participants experiment focuses on varying two main aspects of the games. The first is the degree of strategic interaction. We consider games where $\beta$ takes values in $\left\{0.9,-0.9,-1.8\right\}$; that is, the games exhibit either strategic complementarity (positive feedback) or substitution (negative feedback).¹⁶ We also explicitly consider a case of strong substitution with $\beta=-1.8$. This consideration is motivated by the fact that although New Keynesian models may exhibit strong substitutability, as demonstrated by García-Schmidt and Woodford (2019), the issue has not been investigated extensively in the experimental literature, with exceptions such as Bao and Duffy (2016) and Evans et al. (2025).

The second is the announcement of the shocks to $\alpha$. We examine treatments both with and without pre-announcement of shocks. In the treatments without pre-announcement, participants are informed of the new value of $\alpha$ only when the shocks occur, that is, in period 14 for the first shock and in period 29 for the second shock. In the treatments with pre-announcement, participants are informed of the new value of $\alpha$ two periods before its realization, that is, in period 12 for the first shock and in period 27 for the second shock. This announcement specification is desirable because announcing the shocks two periods in advance allows us to examine participants’ forecasts for prices after the realization of shocks, both before and after the announcements.¹⁷ Because we elicit participants’ forecasts only for five periods at a time (including the current one), if the announcement is made too far in advance, it becomes impossible to analyze its immediate impact on forecast revisions at the time the shock occurs. Therefore, we make the announcement two periods prior to the occurrence of the shock and analyze how forecasts are revised, the question that lies at the core of our study.
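The comparison of forecasts for the same target periods before and after an announcement can be sketched as follows. The forecast values are invented for illustration, and the function names are hypothetical; only the timing (announcement in period 12, shock in period 14) comes from the design.

```python
def forecast_revisions(before, after):
    """Revision f_{t,k} - f_{t-1,k} for every target period k covered by both submissions."""
    common = sorted(set(before) & set(after))
    return {k: after[k] - before[k] for k in common}


def post_shock_revisions(before, after, shock_period):
    """Keep only revisions to forecasts of periods at or after the shock."""
    revisions = forecast_revisions(before, after)
    return {k: r for k, r in revisions.items() if k >= shock_period}


# Hypothetical forecasts by one participant around the first announcement.
f_11 = {11: 65, 12: 65, 13: 65, 14: 65, 15: 65}  # submitted in period 11
f_12 = {12: 65, 13: 66, 14: 80, 15: 84, 16: 85}  # submitted in period 12

print(forecast_revisions(f_11, f_12))
print(post_shock_revisions(f_11, f_12, shock_period=14))
```

With a five-period horizon and a two-period lead, the submissions just before and just after the announcement overlap on two post-shock periods (14 and 15), which is what makes the revision measurable.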

3.3. Procedures

Our experiments were conducted online using oTree (Chen et al., 2016), an open source platform for web-based interactive tasks. Participants joined from their own locations instead of our physical laboratory. We used Zoom to manage and coordinate the experiments.¹⁸

Once participants had received general instructions about the online experiment and were prepared, the prerecorded instruction video was shown on their screen. Although they did not receive physical copies of the instruction slides, participants were informed that they could access the same slides after the video finished, and until they finished the comprehension quiz. To begin the first game, all participants had to correctly answer all six quiz questions.¹⁹ Final rewards were provided through Amazon Gift Cards (e-mail version).

We recruited participants, who were students at the University of Osaka, using ORSEE (Greiner, 2015). An English translation of the instruction slides, examples of the decision screens, and the comprehension quiz are provided in Appendix F.

4. Benchmark analysis: rational expectations equilibrium

We take rational expectations as our benchmark. As shown in Proposition 1, the REE aggregate price denoted by $\left(P^{REE}_{t}\right)$ is characterized by Eqs. (4) and (5) given the initial condition, $P_{-1}$. Unlike in the theoretical analysis, parameter $\alpha$ is time-varying in our experiment. Thus, we need to generalize Proposition 1 to accommodate this case. It is straightforward to show that the REE price with time-varying $\alpha$ satisfies the following equation:

(7)

\begin{equation} P^{REE}_{t} =\left(1-\theta\right)\sum_{k=t}^{t+T-1}\frac{((1-\gamma)\theta)^{k-t}}{\sum_{s=0}^{T-1}((1-\gamma)\theta)^{s}}\left(\alpha_{t,k}+\beta P^{REE}_{k}\right)+\theta P^{REE}_{t-1}, \end{equation}

where $\alpha_{t,k}$ is the expected value of $\alpha$ in period $k$ from the period-$t$ perspective. When the shocks are not announced in advance, $\alpha_{t,k}=\alpha_{t,t}$ for all $k$ and $t$. When the shocks occur, $\alpha_{t,t}$ suddenly increases. Suppose instead that the shocks are pre-announced. In that case, $\alpha_{t,k}$ changes when the shocks are announced, not when they occur. Again, the pre-announcement of future shocks (that is, anticipated shocks) matters only if the game is a multi-period beauty contest game. If firms do not face any pricing frictions ($\theta=0$), then the anticipated shocks have no effect on the REE price.
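The information structure behind $\alpha_{t,k}$ can be sketched in a few lines. This is only an illustration for a single shock: the function name, argument names, and the numerical values of $\alpha$ are hypothetical and not taken from the paper.

```python
def expected_alpha(t, k, alpha_old, alpha_new, shock_period, announce_period=None):
    """Expected alpha for period k as seen from period t (k >= t), single shock.

    announce_period=None means the shock is revealed only when it occurs.
    All names and values are illustrative assumptions, not the paper's.
    """
    if k < t:
        raise ValueError("k must not precede t")
    # Period in which participants first learn the new value of alpha.
    known_from = shock_period if announce_period is None else announce_period
    if t >= known_from and k >= shock_period:
        return alpha_new
    return alpha_old


# First shock: occurs in period 14, announced (if at all) in period 12.
with_announcement = expected_alpha(12, 14, 1.0, 2.0, 14, announce_period=12)
without_announcement = expected_alpha(12, 14, 1.0, 2.0, 14)
```

With the announcement, the period-12 expectation of $\alpha$ for period 14 already equals the new value; without it, the expectation stays at the old value until period 14.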

Figure 1 shows the REE price sequence $\left(P^{REE}_{t}\right)_{t}$ when the shocks are announced two periods in advance (left panel) and when they are not announced (right panel). These figures can be intuitively understood. When the game demonstrates strategic complementarity ( $\beta \gt 0$), the transition to new steady state equilibrium prices is slower due to Calvo pricing friction. Participants recognize that others may not adjust their prices swiftly because of that pricing friction. Since individual optimal prices are positively related to others’ pricing decisions, participants prefer to adjust their prices slowly.

Fig 1.

Aggregate price $P_{t}$ under rational expectation

Note: The solid vertical lines indicate when the two shocks occur, while the dotted lines in the left panel show when these shocks are announced.

By contrast, when the game exhibits strategic substitution ( $\beta \lt 0$), the mechanism operates in the opposite direction. If some agents fail to adjust their prices, the prevailing price becomes excessively low. This low price incentivizes agents to increase their prices. Consequently, the transition to the new steady state equilibrium price occurs more rapidly.

5. Analysis of the experimental data

In this section, we begin by outlining the experimental sessions, including participant numbers, treatment assignments, and earnings. Then, we proceed to analyze our experimental data.

5.1. Overview of the experimental sessions

We conducted our experiments in April and May 2023, involving a total of 294 participants.Footnote ²⁰ Table 2 summarizes the number of groups per treatment. Each experiment lasted for approximately 90 minutes, and participants earned 2482 JPY on average (approximately 18 USD based on the exchange rate at the time), including a show-up fee of 500 JPY.Footnote ²¹ The average payment varied with the value of $\beta$. It was lowest in the treatments with $\beta=-1.8$ (1778 JPY), followed by 2806 JPY and 2844 JPY in treatments with $\beta=-0.9$ and $\beta=0.9$, respectively.

Table 2.

Number of groups per treatment

Notes: “With Announcement” represents the results where the shocks were announced, while “Without Announcement” shows the results where the shocks were not pre-announced. $a$: One group stopped at $t=33$ due to a technical problem. $b$: One group stopped at $t=39$ due to a technical problem.

5.2. Aggregate price dynamics

We begin by presenting the dynamics of prices observed in each treatment in Figure 2. Each line represents a group within each panel. As observed, irrespective of whether an announcement is made, the prices follow the REE price closely when the game exhibits strategic substitutes ($\beta \lt 0$). When the game exhibits strategic complements ($\beta \gt 0$), they deviate persistently from the REE prices. As shown by Fehr and Tyran (Reference Fehr and Tyran2008), these features can be understood intuitively. When the game exhibits strategic complements, the best response function has a positive slope, and participants have a strong motive to choose a price level similar to that of others. Consequently, the realized price remains close to the initial expectation of others’ actions, and the initial expectation fulfills itself. This self-fulfilling mechanism makes the adjustment slow, leading to persistent deviations from the REE price. When the game exhibits strategic substitutes, this mechanism does not operate. Participants want to choose higher (lower) prices when others choose lower (higher) prices. Thus, their initial expectations are not self-fulfilling unless they coincide with the REE price, and they often quickly converge toward the REE price. Note that our experimental findings are consistent with those of existing learning-to-forecast experiments, such as those by Heemeijer et al. (Reference Heemeijer, Hommes, Sonnemans and Tuinstra2009), Bao et al. (Reference Bao, Hommes, Sonnemans and Tuinstra2012), Bao et al. (Reference Bao, Hommes and Makarewicz2017), and Bao and Hommes (Reference Bao and Hommes2019).

Fig 2.

Realized aggregate prices $P_{t}$ (a) Positive Feedback: $\beta=0.9$ (b) Weak Negative Feedback: $\beta=-0.9$ (c) Strong Negative Feedback: $\beta=-1.8$

Notes: The red lines represent the aggregate price under rational expectations. The solid vertical lines indicate when the two shocks occur, while the dotted vertical lines in the left panel show when these shocks are announced.

It is evident that aggregate prices become more stable when the game has weak strategic substitutes ( $\beta = -0.9$). By contrast, with strong strategic substitutes ( $\beta = -1.8$), aggregate prices hover around the REE price but exhibit greater deviations. A similar pattern is also documented by Bao and Duffy (Reference Bao and Duffy2016) and Evans et al. (Reference Evans, Gibbs and McGough2025). This result likely stems from difficulties in expectation coordination. When $\beta$ becomes more negative, the price depends sensitively on expectations. This sensitivity results in the observed instability of prices.Footnote ²²

5.3. Effect of the degree of strategic interaction

We proceed to quantify the degree of deviation from the REE prices by following the methodology developed by Stöckl et al. (Reference Stöckl, Huber and Kirchler2010), which allows us to verify that our results align with evidence from existing one-period learning-to-forecast games. We compute the relative absolute deviation (RAD) and the relative deviation (RD) as proposed in their study. For each group $g$, $RAD_{g}$ and $RD_{g}$ are calculated as follows:

\begin{align*} RAD_{g} & =\frac{1}{K}\sum_{t}\frac{\left|P_{g,t}-P_{t}^{REE}\right|}{P_{t}^{REE}}\\ RD_{g} & =\frac{1}{K}\sum_{t}\frac{P_{g,t}-P_{t}^{REE}}{P_{t}^{REE}}, \end{align*}

where $P_{g,t}$ is the realized period $t$ price for group $g$, $P_{t}^{\text{REE}}$ is the REE price in period $t$, and $K$ represents the total number of periods (40 except for the two groups that faced a technical problem).Footnote ²³
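As a minimal sketch of the two measures, the function below computes $RAD_g$ and $RD_g$ for one group from a realized price path and the corresponding REE path. The function name and the example prices are hypothetical; only the formulas come from the text.

```python
def rad_rd(prices, ree_prices):
    """Return (RAD, RD) for one group's realized prices against the REE path.

    prices[t] and ree_prices[t] refer to the same period; K is the number
    of periods actually played by the group.
    """
    if len(prices) != len(ree_prices) or not prices:
        raise ValueError("paths must be non-empty and of equal length")
    k = len(prices)
    # RAD averages absolute relative deviations; RD keeps their sign.
    rad = sum(abs(p - r) / r for p, r in zip(prices, ree_prices)) / k
    rd = sum((p - r) / r for p, r in zip(prices, ree_prices)) / k
    return rad, rd


# Hypothetical two-period path that oscillates around an REE price of 100.
rad, rd = rad_rd([110.0, 90.0], [100.0, 100.0])
```

In this example the deviations are 10% in each period, so RAD is about 0.1, while the signed deviations cancel and RD is about zero. Prices that fluctuate symmetrically around the REE price can therefore have a sizable RAD and a negligible RD.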

Figure 3 shows the empirical cumulative distributions of RADs (top) and RDs (bottom) in the treatments with (left) and without (right) an announcement. In each panel, the distributions for each $\beta$ are shown. The top panels show that regardless of the existence of an announcement, RADs are positive for each $\beta$. A signed-rank test applied to the distributions of RADs shows that the observed prices are significantly different from the REE prices.Footnote ²⁴ Furthermore, the distributions of RADs are ranked in terms of first-order stochastic dominance; regardless of whether an announcement was made, the distribution under $\beta = 0.9$ stochastically dominates those under $\beta = -1.8$ and $\beta = -0.9$. This indicates a higher likelihood of larger deviations from the REE prices under $\beta = 0.9$ than in the other treatments, and the differences between the three treatments are statistically significant both with and without an announcement.Footnote ²⁵ The deviations from the REE prices are smallest under weak substitution ($\beta = -0.9$), larger under strong substitution ($\beta = -1.8$), and largest under strategic complementarity ($\beta = 0.9$).

Fig 3.

Deviations from the rational expectation prices (a) Relative Absolute Deviation (RAD) (b) Relative Deviation (RD)

Note: p-values of Kruskal-Wallis test are reported.

The bottom panels in Figure 3 build on these findings by showing that RDs are also larger under strategic complementarity than under strategic substitution, mirroring the pattern observed for RADs. However, in contrast to the results for RADs, there is no longer a statistically significant difference between $\beta = -0.9$ and $\beta = -1.8$.Footnote ²⁶ The lack of statistical significance comes from the fact that RDs, by incorporating the direction of deviation from the REE price, become nearly zero when prices fluctuate around the REE price, as in the case where $\beta = -1.8$.Footnote ²⁷

5.4. Forward-lookingness of expectations

We now examine whether forecasts have responded to anticipated shocks: that is, the announcement of future shocks.Footnote ²⁸ To accomplish this objective, we analyze forecast revisions before and after anticipated shocks to assess whether individuals have reacted to the shocks or disregarded them. In particular, if there are no revisions to forecasts after an anticipated shock, it suggests that the anticipated shock is ignored.Footnote ²⁹

To operationalize this analysis, we define our measures of forecast revisions as follows. We focus on the forecasts for changes in prices during periods 14 and 29, $P_{14}-P_{13}$ and $P_{29}-P_{28}$, and analyze how these forecasts are revised before and after the associated shock announcements. These revisions are mathematically represented as

(8)

\begin{equation} \Delta f^i_{1} = E^i\left[P_{14}-P_{13}\mid t\leq13\right]-E^i\left[P_{14}-P_{13}\mid t\leq11\right], \mbox{and } \end{equation}

(9)

\begin{equation} \Delta f^i_{2} = E^i\left[P_{29}-P_{28}\mid t\leq28\right]-E^i\left[P_{29}-P_{28}\mid t\leq26\right], \end{equation}

where the conditional expectations are taken over the relevant information sets. For example, $E^i\left[P_{14}-P_{13}\mid t\leq13\right]$ represents the expected value of the change of the price in period 14 conditional on all information available up to period 13. Thus, the differences in the conditional expectations, $\Delta f^i_{1}$ and $\Delta f^i_{2}$, capture the revisions of the forecasts of price changes $P_{14}-P_{13}$ and $P_{29}-P_{28}$ in response to the announcements in periods 12 and 27, respectively.Footnote ³⁰ The conditional mean is calculated based on information available through periods 13 and 28 instead of periods 12 and 27, to mitigate noise by including two post-announcement observations.

By examining whether these revisions are zero, we can infer whether individuals respond to the anticipated shocks. In particular, observing nonzero revisions strongly suggests that participants do indeed react to the announcements. At the same time, because our analysis focuses on revisions tied to price changes, there is another way to interpret nonzero revisions: announcements exert a greater influence on forecasts for more distant future prices relative to the near term.Footnote ³¹ However, it is important to emphasize that a nonzero revision is a sufficient but not a necessary condition for forward-lookingness. For example, if participant $i$ revises her forecasts of both $P_{14}$ and $P_{13}$ by 10 points each in response to the first announcement, $\Delta f_1^i$ remains at zero. According to our measure, she would then be classified as not responding to the announcement, even though she does respond. This example highlights that our inference based on the revisions provides a conservative gauge of forward-lookingness. We intentionally adopt this conservative criterion to reduce the likelihood of falsely concluding that participants are forward looking when they are not, thereby ensuring a more robust measure of genuine forward-looking behavior.

To calculate these revisions for each participant $i$, we take advantage of our experimental design. We measure the forecasts for the price change in period 14, conditional on the information available by period 13, as $\sum_{t=12}^{13}\left(f_{t,14}^{i}-f_{t,13}^{i}\right)/2$. Recall that $f_{t,14}^{i}-f_{t,13}^{i}$ represents the forecast of $P_{14}-P_{13}$ in period $t$. Hence, this average corresponds to the conditional expectation after the first shock announcement, if any. Similarly, we measure the conditional expectation before the first shock announcement, if any, as $\sum_{t=10}^{11}\left(f_{t,14}^{i}-f_{t,13}^{i}\right)/2$. The difference of these averages corresponds to $\Delta f^i_1$. We define $\Delta f^i_2$ in an analogous way. Note that our multi-period experimental design uniquely allows us to measure these revisions. In the majority of existing papers, these revision measures are simply unavailable since sequences of forecasts are not collected.
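The averaging described above can be sketched as follows. The data layout is an assumption made for illustration: `forecasts[t][s]` holds one participant's forecast $f_{t,s}^{i}$ of the period-$s$ price submitted in period $t$, and all function names and numbers are hypothetical.

```python
def mean_forecast_change(forecasts, periods, target):
    """Average of f_{t,target} - f_{t,target-1} over the forecast dates in `periods`."""
    changes = [forecasts[t][target] - forecasts[t][target - 1] for t in periods]
    return sum(changes) / len(changes)


def forecast_revision(forecasts, shock_period):
    """Revision of the forecast of P_shock - P_{shock-1} around the announcement.

    For shock_period = 14 this compares forecast dates 12-13 (after the
    announcement date) with forecast dates 10-11 (before it).
    """
    post = mean_forecast_change(
        forecasts, (shock_period - 2, shock_period - 1), shock_period)
    pre = mean_forecast_change(
        forecasts, (shock_period - 4, shock_period - 3), shock_period)
    return post - pre


# Hypothetical participant who expects a flat price before period 12
# and a rising price afterwards.
example = {
    10: {13: 50.0, 14: 50.0},
    11: {13: 50.0, 14: 50.0},
    12: {13: 52.0, 14: 56.0},
    13: {13: 53.0, 14: 59.0},
}
delta_f1 = forecast_revision(example, shock_period=14)
```

Here the post-announcement average is (4 + 6) / 2 = 5 and the pre-announcement average is 0, so the revision is 5. A participant who raises her forecasts of $P_{13}$ and $P_{14}$ by the same amount would obtain a revision of zero, which is the conservative feature of the measure noted in the text.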

To analyze the distributions of $\Delta f_{1}^{i}$ and $\Delta f_{2}^{i}$, it is useful to compare them with their theoretical values under the REE. When the shocks are not announced, their theoretical values are easily obtained. Under the REE hypothesis, the entire path of the prices $P_t$ is rationally expected at the beginning of the game. Thus, no additional information becomes available over time and forecasts are not revised at all, so both $\Delta f^i_{1}$ and $\Delta f^i_{2}$ are zero in this case.

Consider the case where shocks are announced. Since those announcements constitute new information, they trigger revisions of the forecasts. To compute $ \Delta f^i_{1} $ under the REE, we solve our model in Section 2 in two cases. We solve the model without any shocks and then solve it again under the assumption that the first shock is announced. We then compute the difference, $ P_{14} - P_{13} $, for both cases. Finally, we subtract the difference with no shocks from the difference with the first shock announcement. This double difference corresponds to the theoretical counterpart of $ \Delta f^i_1 $. We compute the theoretical value of $ \Delta f^i_2 $, using the same procedure.
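The double difference described above can be written compactly. The superscripts are notation introduced here for illustration only: $P^{A}_{t}$ denotes the REE price path solved under the assumption that the first shock is announced, and $P^{N}_{t}$ denotes the REE price path solved without any shocks.

\begin{equation*} \Delta f_{1}^{REE}=\left(P^{A}_{14}-P^{A}_{13}\right)-\left(P^{N}_{14}-P^{N}_{13}\right). \end{equation*}

The theoretical value for the second announcement follows from the same expression with periods 29 and 28 in place of periods 14 and 13.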

We begin our analysis by considering the scenario where shocks are not announced. This analysis serves as a sanity check for our experimental setup since we naturally expect participants not to revise their forecasts in the absence of new information. Figure 4 presents the histograms for $\Delta f_{1}^{i}$ and $\Delta f_{2}^{i}$ for various $\beta$ values. The figure indicates that both $\Delta f_{1}^{i}$ and $\Delta f_{2}^{i}$ for all $\beta$ are centered around zero, with their median values exactly equal to zero. To be specific, the fractions of individuals who do not revise their forecasts are greater than $40\%$. As discussed above, this result aligns with the model’s prediction under the REE; most participants do not revise their forecasts unless new information arrives. It is important to note that while our finding constitutes a necessary condition for participants to hold rational expectations, it does not suffice on its own. For example, if participants strongly believe that the environment is stationary and therefore assume that $P_t = P_{t+1}=\cdots$, then $E^i(P_{t+1}-P_{t}|P_{\tau \lt t})=0$. In such cases, even though participants do not form their expectations based on rational expectations, $\Delta f_{1}^{i}$ and $\Delta f_{2}^{i}$ still equal zero. We examine how each participant forms her belief over the price levels in Section 5.5.

Fig 4.

Forward-lookingness of expectations: case without announcement (a) Distributions of Forecast Revisions After the First Announcement $\Delta f_{1}^{i}$ (b) Distributions of Forecast Revisions After the Second Announcement $\Delta f_{2}^{i}$

Next, we analyze the case where shocks are announced, as depicted in the histograms in Figure 5. Three important observations emerge from this analysis. First, the histograms appear right-skewed, and the median values have become positive, indicating that some participants respond significantly to the announcements.

Fig 5.

Forward-lookingness of expectations: case with announcement (a) Distributions of Forecast Revisions After the First Announcement $\Delta f_{1}^{i}$ (b) Distributions of Forecast Revisions After the Second Announcement $\Delta f_{2}^{i}$

Second, there is considerable heterogeneity in participants’ responses. To examine this heterogeneity, we compute for each treatment the fraction of participants who did not revise their forecasts ( $\Delta f_{j}^{i} = 0$), those who revised their forecasts to a greater extent than predicted by the REE model, and others. As shown in Figure 5, about a quarter of participants did not revise their forecasts, implying that they ignored the announcement when forecasting price change. However, a significant fraction of participants adjust their forecasts upward, accounting for the announcements. Notably, some even overreact by revising their forecasts more than the REE model predicts. For example, after the announcement of the second shock, $33\%$ of participants overreacted when $\beta = -0.9$. This finding suggests strong forward-looking behavior.Footnote ³²

Another way to classify revisions, $\Delta f_{j}^{i}$, is to apply level-$k$ reasoning (Nagel, Reference Nagel1995). Level-$k$ reasoning represents a form of forward-looking expectation formation and is often employed to analyze situations in which agents must make forecasts after unprecedented shocks (Farhi & Werning, Reference Farhi and Werning2019). Following Evans et al. (Reference Evans, Gibbs and McGough2025), we derive the theoretical counterparts of $\Delta f_{j}^{i}$ under level-$k$ forecasting.Footnote ³³ Let the default forecast sequence be $\{P_{t+s}(0)\}_{s=0}^\infty$. The level-$k$ forecast sequence $\{P_{t+s}(k)\}_{s=0}^\infty$ is then defined recursively by

(10)

\begin{equation} P_{t}(k) = (1-\theta)\sum_{s=0}^{T-1}\frac{\bigl((1-\gamma)\theta\bigr)^{s}}{\sum_{m=0}^{T-1}\bigl((1-\gamma)\theta\bigr)^{m}}\bigl(\alpha+\beta P_{t+s}(k-1)\bigr) +\theta P_{t-1}(k-1). \end{equation}

We define the default forecast sequence $P_{t+s}(0)$ as the average of the realized prices in the two periods immediately preceding the forecast date. For example, when computing the level- $k$ theoretical forecast

\begin{equation*} E^i\bigl[P_{14}-P_{13}\mid t\leq13\bigr], \end{equation*}

we set

\begin{equation*} P_{t+s}(0)=\frac{P_{12}+P_{13}}{2} \quad\text{for all }s\ge0. \end{equation*}

We then assign each participant to the level whose theoretical forecast lies closest to her observed revision. Participants whose revisions deviate by more than three units from all level-0, level-1, level-2, and REE forecasts are categorized as “Others.” The cutoff of three follows Evans et al. (Reference Evans, Gibbs and McGough2025). Figure 6 shows the results.
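The assignment rule can be sketched as a nearest-benchmark classifier with a cutoff. The benchmark values in the example are hypothetical placeholders; the actual values come from the level-$k$ recursion and the REE solution, which this sketch does not compute.

```python
def classify_revision(observed, benchmarks, cutoff=3.0):
    """Assign an observed revision to the closest theoretical benchmark.

    benchmarks maps a label (e.g. "level-0", "REE") to its theoretical
    revision. If the observed revision is more than `cutoff` units away
    from every benchmark, it is labeled "Others".
    """
    label, value = min(benchmarks.items(), key=lambda kv: abs(observed - kv[1]))
    return label if abs(observed - value) <= cutoff else "Others"


# Hypothetical benchmark revisions for one treatment and one announcement.
hypothetical_benchmarks = {"level-0": 0.0, "level-1": 4.0, "level-2": 7.0, "REE": 10.0}
label = classify_revision(4.5, hypothetical_benchmarks)
```

In this example a revision of 4.5 is assigned to level-1, while a revision of 20 would fall outside the cutoff for every benchmark and be labeled "Others". Ties between two equally close benchmarks are resolved in favor of the one listed first, which is a choice of this sketch.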

Fig 6.

Classification of participants based on depth of reasoning

As expected from Figure 5, large shares of participants are classified as level-0 forecasters since they do not revise their forecasts at all. Moreover, sizable shares fall into each of the level-0 through level-3 and REE categories even in our setting. These results mirror those reported by Evans et al. (Reference Evans, Gibbs and McGough2025). One notable quantitative divergence is that the share of participants whose forecasts conform to the REE benchmark is smaller here than in Evans et al. (Reference Evans, Gibbs and McGough2025). In our setup, only about 5% and 12% of participants are classified as REE forecasters after the first and second announcements, respectively, compared with 25% and 38% in Evans et al. (Reference Evans, Gibbs and McGough2025). This discrepancy likely reflects the greater difficulty of constructing REE forecasts in our dynamic environment, where agents must account for the transition path to the new steady state.Footnote ³⁴

Finally, we investigate whether participants begin to forecast the REE price more accurately following the announcements. We do so by computing the absolute difference between $\Delta f_{j}^{i}$ and the REE benchmark for each participant and treatment. We then evaluate whether the median of these participant-level deviations declines significantly after the second announcement within each treatment group. Although Figure 5 indicates the possibility of learning in some treatments (notably for $\beta = 0.9$), the observed deviations are not statistically significantly smaller for the second announcement than the first one. Accordingly, we fail to find evidence of learning in response to announcements.Footnote ³⁵

While accounting for the substantial heterogeneity in participants’ responses is important, it is equally critical to assess whether, on average, participants react to the announcement. This is relevant because in our linear demand system, the aggregate price is governed by the average forecast. To this end, we analyze whether the mean values of $(\Delta f^i_J)_{J=1,2}$ differ between conditions with and without the announcement, using the Mann-Whitney U test. Furthermore, we investigate whether these values are drawn from the same distribution by applying the Kolmogorov-Smirnov test. We hypothesize that the announcement causes a rightward shift in the distribution of $\Delta f^i_J$; therefore, we perform these tests with a one-sided alternative hypothesis.
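A minimal sketch of a one-sided Mann-Whitney comparison is given below, using only the standard library. It relies on the normal approximation with a tie correction and no continuity correction; these are assumptions of the sketch, since the text does not state how the $p$-values were computed. The function name and the sample values are hypothetical.

```python
import math
from collections import Counter


def mann_whitney_greater(x, y):
    """One-sided Mann-Whitney U test of H1: x tends to be larger than y.

    Returns (U, p), where p is the upper-tail probability from the normal
    approximation with tie correction (no continuity correction).
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    # U counts pairs with x_i > y_j; tied pairs contribute one half.
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    n = n1 + n2
    tie_term = sum(c ** 3 - c for c in Counter(list(x) + list(y)).values())
    var = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var <= 0:
        return u, 1.0  # all observations identical: no evidence of a shift
    z = (u - n1 * n2 / 2.0) / math.sqrt(var)
    p = 0.5 * math.erfc(z / math.sqrt(2.0))  # upper-tail normal probability
    return u, p


# Hypothetical revisions with and without an announcement.
announced = [3, 4, 5, 6, 7, 8]
unannounced = [0, 0, 1, 1, 2, 2]
u_stat, p_value = mann_whitney_greater(announced, unannounced)
```

Because many revisions are exactly zero in the data, ties are common, which is why the tie correction is included. For small samples an exact or permutation-based $p$-value would be preferable to the normal approximation.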

Figure 7 presents the empirical distributions of $\Delta f_{1}^{i}$ and $\Delta f_{2}^{i}$ for both announced and unannounced cases, along with the associated $p$-values from the tests. As shown in that figure, the distributions tend to shift to the right, except in the case of $\beta = 0.9$ and $J=2$. The statistical tests confirm these visual patterns. Specifically, the distribution of $\Delta f^i_J$ with an announcement is significantly different from that without an announcement, suggesting that participants generally revised their forecasts upward in response to an announcement. We conclude this section by stating that, on average, participants responded to the announcements, indicating that they are forward looking.

Fig 7.

Forward-lookingness of expectations (a) Empirical Distribution of Forecast Revisions After the First Announcement $\Delta f_{1}^{i}$ (b) Empirical Distribution of Forecast Revisions After the Second Announcement $\Delta f_{2}^{i}$

Notes: The p-value for the Mann-Whitney U test is denoted by pMW, while the one for the Kolmogorov-Smirnov test is represented by pKS. The REE benchmarks represent the theoretical values of the revisions, $\Delta f_{J}^{i}$, when shocks are announced. The theoretical values of the revisions when shocks are not announced are trivially zero.

5.5. Individual expectation formation

Having analyzed average responses to anticipated shocks in the preceding section, we now turn to expectation formation in the absence of shocks. We investigate how each participant forms expectations during such normal periods within our multi-period framework.

Motivated by the work of Anufriev and Hommes (Reference Anufriev and Hommes2012), we estimate a reduced form forecasting rule. We generalize their forecasting rule by allowing coefficients to vary across different forecast horizons:

(11)

\begin{equation} f_{t,t+j}^{i}=\alpha_{j}^{i}+\gamma_{j}^{i}P_{t-1}+\omega_{j}^{i}\left(f_{t-1,t-1}-P_{t-1}\right)+\chi_{j}^{i}\left(P_{t-1}-P_{t-2}\right)+\varepsilon_{t,t+j}^{i}. \end{equation}

This reduced-form learning equation can capture various expectation formation mechanisms. For example, both adaptive heuristic and trend-following methods can be represented within this framework. Participant $i$ uses an adaptive heuristic if the forecast rule is given by

\begin{equation*} f_{t,t+j}^{i}=\gamma_j^i P_{t-1}+\omega_{j}^{i}\left(f_{t-1,t-1}-P_{t-1}\right). \end{equation*}

The coefficient $\omega_{j}^{i}$ captures the extent to which participant $i$ incorporates her past expectations into her current expectation. We generalize the adaptive heuristic in Anufriev and Hommes (Reference Anufriev and Hommes2012) by allowing the coefficient on $P_{t-1}$ to differ from one. Alternatively, participant $i$ adopts a trend-following method if her expectation is given by

\begin{equation*} f_{t,t+j}^{i}=\gamma_j^i P_{t-1}+\chi_{j}^{i}\left(P_{t-1}-P_{t-2}\right), \end{equation*}

where $\chi_{j}^{i}$ governs the strength of extrapolation; a higher $\chi_{j}^{i}$ results in greater extrapolation by the participant. As in the adaptive heuristic, we do not impose a unit coefficient on $P_{t-1}$.
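The nesting of the two heuristics in Eq. (11) can be illustrated with a short sketch of the deterministic part of the rule. The function and argument names are hypothetical, and the price values are made up; the two values of $\chi$ are the conditional means reported in the text for $\beta=0.9$ at $j=0$ and $j=4$.

```python
def forecast_rule(p_lag1, p_lag2, own_lag_forecast,
                  alpha=0.0, gamma=1.0, omega=0.0, chi=0.0):
    """Deterministic part of Eq. (11) for one participant and one horizon j.

    p_lag1 = P_{t-1}, p_lag2 = P_{t-2}, own_lag_forecast = f_{t-1,t-1}.
    """
    return (alpha
            + gamma * p_lag1
            + omega * (own_lag_forecast - p_lag1)
            + chi * (p_lag1 - p_lag2))


def adaptive(p_lag1, own_lag_forecast, gamma=1.0, omega=0.5):
    """Adaptive heuristic: no constant and no trend term."""
    return forecast_rule(p_lag1, p_lag1, own_lag_forecast, gamma=gamma, omega=omega)


def trend_following(p_lag1, p_lag2, gamma=1.0, chi=0.5):
    """Trend-following rule: no constant and no forecast-error term."""
    return forecast_rule(p_lag1, p_lag2, p_lag1, gamma=gamma, chi=chi)


# A price that rose from 98 to 100, extrapolated with a weak and a strong chi.
short_horizon = trend_following(100.0, 98.0, chi=0.53)
long_horizon = trend_following(100.0, 98.0, chi=1.95)
```

With a unit coefficient on $P_{t-1}$, the same two-point price increase translates into a forecast of about 101.06 when $\chi=0.53$ and about 103.9 when $\chi=1.95$, which shows how a larger $\chi$ at longer horizons implies an expectation of continued price increases.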

Note that the reduced-form forecasting rule in Eq. (11) is entirely backward-looking. As discussed previously, some participants respond to the shocks and announcements. To concentrate on belief formation during stable periods, we eliminate the influence of forward-looking behavior by excluding any data from periods with shocks and announcements. Because we drop these observations, we do not differentiate between cases with and without announcements.

We begin our analysis by running the regression in Eq. (11) for each participant $i$. Similar to Anufriev and Hommes (Reference Anufriev and Hommes2012), the reduced-form forecasting rule in Eq. (11) captures most of the variations in forecasts. Table 3 reports in the column labeled $R^2$ the average $R^{2}$ for each treatment. The $R^{2}$ values are high across all treatments and horizons, but vary with respect to both. Specifically, $R^{2}$ decreases as the forecast horizon increases. Additionally, when $\beta=-0.9$, $R^{2}$ is higher than in other cases, likely reflecting the fact that prices fluctuate less when $\beta=-0.9$.

Table 3.

Average $R^2$ and classification of types across treatments and forecast horizons

Notes: We run separate regressions, $f_{t,t+j}^{i}=\alpha_{j}^{i}+\gamma_{j}^{i}P_{t-1}+\omega_{j}^{i}(f_{t-1,t-1}-P_{t-1})+\chi_{j}^{i}(P_{t-1}-P_{t-2})+\varepsilon_{t,t+j}^{i}$, for each participant and determine the $p$-values of the coefficients. The three elements in the triplets indicate the significance (1) or non-significance (0) of the coefficients $\gamma_{j}^{i}$, $\omega_{j}^{i}$, and $\chi_{j}^{i}$, respectively.

Individual-level regressions allow us to categorize participants into different types. Table 3 presents these classifications across various treatments and horizons, based on whether the coefficients from the regression in Eq. (11) are significant at $p \lt 0.1$. Participants are classified into types represented by triplets (e.g., 1-0-0), where each digit indicates the statistical significance (1) or non-significance (0) of the respective coefficients ($\gamma_{j}^{i}$, $\omega_{j}^{i}$, $\chi_{j}^{i}$). The (1-0-0) type, which is characterized by adaptive expectation formation, makes up a large portion of participants, suggesting that many rely exclusively on past price information ($P_{t-1}$) in their forecasting. By contrast, very few participants do not use past price information at all (0-*-*). Additionally, the (1-1-1) type is common, indicating that many participants employ both adaptive heuristic and trend-following strategies in their forecasts. Types (1-0-1) and (1-1-0), recognized as trend-chasers and adaptive-heuristic forecasters, respectively, are also well represented, demonstrating the diversity in how participants form their expectations. Thus, Table 3 reveals substantial heterogeneity in expectation formation among participants.
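The mapping from coefficient $p$-values to the triplet labels can be sketched as follows. The function name and the example $p$-values are hypothetical; the 10% threshold is the one stated in the text.

```python
def classify_type(p_gamma, p_omega, p_chi, level=0.1):
    """Return a triplet such as "1-0-1" from the p-values of gamma, omega, chi.

    A digit is 1 when the corresponding coefficient is significant at
    `level`, and 0 otherwise.
    """
    return "-".join("1" if p < level else "0" for p in (p_gamma, p_omega, p_chi))


# Hypothetical participant: only the coefficient on P_{t-1} is significant.
participant_type = classify_type(0.001, 0.45, 0.30)
```

In this example the participant is labeled 1-0-0. The label records only statistical significance, not the size of the coefficients.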

While Table 3 indicates how participants use past information, it does not reveal whether the magnitudes of the coefficients are economically meaningful. To assess the sizes of the coefficients, we report their conditional means and associated standard deviations among significant coefficients in Table 4. Three key observations emerge. First, participants predominantly use the most recent price, $P_{t-1}$, as a reference point when forming their forecasts, as evidenced by the coefficients $\gamma^i_j$ on $P_{t-1}$ being close to one across all forecast horizons.

Table 4.

Conditional means and standard deviations of coefficients

Notes: We run separate regressions, $f_{t,t+j}^{i}=\alpha_{j}^{i}+\gamma_{j}^{i}P_{t-1}+\omega_{j}^{i}(f_{t-1,t-1}-P_{t-1})+\chi_{j}^{i}(P_{t-1}-P_{t-2})+\varepsilon_{t,t+j}^{i}$, for each participant $i$. We compute the average values and standard deviations of the coefficients, conditional on their associated $p$-values being less than 10%.

Second, participants consider their own past forecast errors, $f_{t-1,t-1}-P_{t-1}$, indicating that individual expectations are somewhat self-referential. While the coefficients $\omega^i_j$ could theoretically be negative, the regression results suggest that they are positive and close to 0.5. This implies that participants gradually adjust their expectations even if their previous forecast exceeded the previous price: that is, $f_{t-1,t-1}^{i} \gt P_{t-1}$.

Finally, the influence of the most recent price change, $P_{t-1}-P_{t-2}$, becomes more pronounced as the forecast horizon extends, when the game exhibits positive feedback. Specifically, the coefficients $\chi_{j}^{i}$ on $P_{t-1}-P_{t-2}$ increase substantially from 0.53 for $j=0$ to 1.95 for $j=4$ when $\beta=0.9$. A formal statistical test supports this observed pattern. Specifically, we conduct a one-sided Mann-Whitney test to examine whether the median coefficient rises with the length of the horizon. The $p$-value for the difference between $j=0$ and $j=1$ is about 8%, dropping to roughly 0.2% for the comparison between $j=0$ and $j=4$. These findings suggest a statistically significant increase in the median coefficient as the horizon extends. This significant increase suggests that participants place greater importance on recent price trends when making longer-term forecasts, relying more on momentum or trend-following behavior for expectations about the distant future.

This last point merits further attention. In typical experimental settings where only one-period expectations are elicited, it is not possible to ascertain whether participants view price increases as persistent or temporary. This point becomes clearer when we compare the coefficients $\chi_{0}^{i}$ when $\beta=0.9$ to those when $\beta=-0.9$. Both coefficients are around 0.5, suggesting that participants anticipate continued price increases based on recent trends. If this were the only evidence available, one would conclude that participants employ similar extrapolation when $\beta=0.9$ and $\beta=-0.9$. However, our experiment can determine whether participants perceive price increases as temporary or persistent. Because multiple forecasts are elicited, we can estimate $\chi_{j}^{i}$ for $j=1,\ldots,4$. As noted above, when $\beta=0.9$, participants expect higher and higher prices, indicating that they believe the trend will persist. Conversely, when $\beta=-0.9$, participants do not clearly expect prices to keep rising, suggesting they view the price increase as temporary. This highlights the novelty of our experiment in allowing us to discern participants’ perceptions of the persistence of price changes.

To further examine this feature, we run a regression with interaction terms for each participant $i$:

\begin{align*} f_{t,t+j}^{i}-f_{t,t}^{i} & =\alpha^{i}+\text{FE}_{j}+\gamma^{i}P_{t-1}+\omega^{i}\left(f_{t-1,t-1}-P_{t-1}\right)+\chi^{i}\left(P_{t-1}-P_{t-2}\right)\\ & +\tilde{\gamma}^{i}jP_{t-1}+\tilde{\omega}^{i}j\left(f_{t-1,t-1}-P_{t-1}\right)+\tilde{\chi}^{i}j\left(P_{t-1}-P_{t-2}\right), \end{align*}

where $\text{FE}_{j}$ represents the fixed effects controlling for horizons. Our primary interest lies in the coefficients $\tilde{\gamma}^{i}$, $\tilde{\omega}^{i}$, and $\tilde{\chi}^{i}$, which govern the strength of the interaction terms. These coefficients capture the change in anchoring, adaptiveness, and trend-chasing as the horizon increases by one period. Based on the discussion above, we would expect $\tilde{\chi}^{i}$ to be significant and positive when $\beta=0.9$.
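To see why $\tilde{\chi}^{i}$ measures horizon-dependent extrapolation, it helps to write the implied response of the forecast spread to the most recent price change. Holding the other regressors fixed, the regression above gives

\begin{equation*} \frac{\partial\left(f_{t,t+j}^{i}-f_{t,t}^{i}\right)}{\partial\left(P_{t-1}-P_{t-2}\right)}=\chi^{i}+j\,\tilde{\chi}^{i}. \end{equation*}

A positive $\tilde{\chi}^{i}$ therefore means that each additional period of horizon raises the response to a given recent price change by $\tilde{\chi}^{i}$, while $\tilde{\chi}^{i}=0$ means that the response is the same at every horizon.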

The regression results are summarized in Table 5. As in previous tables, we report the conditional means of these coefficients among cases where the $p$-values are under 10%. The associated standard deviations are reported in parentheses, along with the fraction of participants whose coefficients exhibit statistical significance below the 10% level.

Table 5. Summary statistics of interaction terms by treatment

Note: We report the conditional means of the coefficients with standard deviations in parentheses, along with the percentage of participants for whom the coefficients are significant below the 10% level.
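The summary statistics described in the note can be sketched as follows. This is only an illustration under stated assumptions: the function name `summarize_significant` and the input values are hypothetical, and the use of the sample standard deviation (`ddof=1`) is an assumption, since the text does not say which variant is reported.

```python
import numpy as np

def summarize_significant(coefs, pvals, alpha=0.10):
    """Mean and SD of coefficients with p < alpha, plus the share of such participants."""
    coefs = np.asarray(coefs, dtype=float)
    pvals = np.asarray(pvals, dtype=float)
    keep = pvals < alpha
    n_sig = int(keep.sum())
    mean = float(coefs[keep].mean()) if n_sig > 0 else float("nan")
    # Sample standard deviation; undefined with fewer than two significant cases.
    sd = float(coefs[keep].std(ddof=1)) if n_sig > 1 else float("nan")
    return {"mean": mean, "sd": sd, "share": n_sig / len(coefs)}


# Hypothetical per-participant estimates and p-values (not taken from the paper).
chi_tilde = [0.30, 0.10, 0.50, -0.02, 0.40]
p_values = [0.01, 0.20, 0.04, 0.60, 0.09]
summary = summarize_significant(chi_tilde, p_values)
```

In this made-up example three of the five participants pass the 10% threshold, so the conditional mean is taken over those three coefficients only.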

Table 5 shows that while participants exhibit statistically significant coefficients on the interaction terms involving horizon and other variables, these effects are economically negligible for all cases except those involving the most recent price change, $P_{t-1}-P_{t-2}$. Specifically, consider the interaction terms with $P_{t-1}$ and $(f_{t-1,t-1}-P_{t-1})$. Although these interaction coefficients, $\tilde{\gamma}^{i}$ and $\tilde{\omega}^{i}$, are statistically different from zero for a substantial fraction of participants, their magnitudes remain small. In other words, even though some participants adjust their reliance on current price levels or past forecast errors as they look further into the future, these adjustments are tiny. Thus, for these variables, changes in horizon have essentially no meaningful effect on participants’ forecasts.

Turning to the interaction involving $P_{t-1}-P_{t-2}$, we first consider cases with strategic substitutes ( $\beta \lt 0$). Here, while the interaction coefficients $\tilde{\chi}^{i}$ are often statistically significant, they also remain economically small. In these environments, participants do not meaningfully increase their reliance on recent price trends when forming longer-horizon forecasts, implying that even if prices have risen recently, participants tend to view such increases as temporary and not indicative of persistent inflation.

In contrast, when facing strategic complements ( $\beta \gt 0$), the interaction coefficients for $P_{t-1}-P_{t-2}$ are large enough to be economically important. For example, a 10-unit increase in $P_{t-1}-P_{t-2}$ leads to a 16-unit increase in five-period-ahead price forecasts under strategic complementarity. Participants extrapolate upward trends more aggressively as the forecast horizon extends and anticipate that rising prices will persist. This strategic environment effect, horizon-dependent extrapolation that occurs only in environments with strategic complementarity, is a novel finding made possible by our multi-period setting.
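The role of $\tilde{\chi}^{i}$ in this comparison can be read off the interaction regression directly. Holding the other regressors fixed, the response of the horizon-$j$ forecast spread to the most recent price change is

\begin{align*} \frac{\partial\left(f_{t,t+j}^{i}-f_{t,t}^{i}\right)}{\partial\left(P_{t-1}-P_{t-2}\right)} & =\chi^{i}+j\,\tilde{\chi}^{i}, \end{align*}

so each additional period of horizon changes the response by $\tilde{\chi}^{i}$. When $\tilde{\chi}^{i}$ is close to zero, the response is essentially flat in $j$; when $\tilde{\chi}^{i}$ is positive, it grows linearly with the horizon. This is a restatement of the regression specification, not an additional estimate.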

Our findings have important implications for monetary policy, whose effectiveness depends on how agents form expectations. In environments characterized by strategic complementarity, our results suggest that agents’ long-run inflation expectations could become de-anchored. Central banks therefore need to be mindful of these variations in expectation formation, especially because it is costly to re-anchor inflation expectations. Conversely, in environments with strategic substitutes, the nature of forecasting remains relatively stable across horizons. Thus, central banks could conduct their monetary policy without worrying too much about de-anchoring, which simplifies their operations.

Before closing this section, we highlight a point that warrants caution. As noted in footnote 10, our design may nudge some participants toward forward-looking behavior by explicitly asking them to forecast several future periods rather than only the next period. It is not evident whether individuals would form such expectations in more typical settings without such instructions. One way to explore this possibility is to conduct an experiment in which participants directly choose their optimal price in a dynamic environment, rather than engage in a learning-to-forecast experiment. While this alternative design would allow us to analyze how participants form their forecasts without explicitly asking them to forecast several future periods, it may introduce optimization errors by participants. A carefully designed experiment may help isolate expectation formation in a dynamic environment without either introducing optimization errors or conflating it with the nudge effect. We believe that it is important to explore this question because our findings have important policy implications for the possible de-anchoring of expectations. We leave such analysis for future research.

6. Concluding remarks

This paper introduces and experimentally examines a multi-period extension of the standard beauty contest game, inspired by the Calvo (1983) pricing model prevalent in macroeconomics. Based on this new dynamic experimental framework, we investigate how participants respond to shocks and announcements and how they form their expectations without such disturbances.

Exploiting this multi-period setup, we propose a novel test for forward-looking behavior. Our results show that participants revise their forecasts after announcements of future shocks, demonstrating that, on average, they incorporate information about future states into their current expectations.

We also identify a new strategic environment effect in individual expectation formation. We find that trend extrapolation varies with the strategic setting. In the presence of strategic complementarity, participants strongly extrapolate current price changes to the long run. Moreover, this trend-chasing behavior intensifies as the forecast horizon lengthens, implying that long-run expectations can become de-anchored, which in turn generates greater macroeconomic instability. Under strategic substitution, by contrast, price movements are perceived as more temporary, and trend extrapolation does not intensify with the forecast horizon, resulting in more stable long-run expectations.

These results have policy implications. In real-world macroeconomic and financial systems, strategic complementarity is prevalent and agents plan over extended horizons. It is often found in the literature that longer planning horizons tend to stabilize aggregate dynamics by allowing agents to anchor on fundamentals. Our results suggest the opposite: in the presence of strategic complementarity, longer horizons can amplify destabilizing momentum-driven dynamics. Policymakers should be aware of this mechanism and recognize that managing expectations requires particular vigilance in such environments. In all cases, policy design should reflect the strategic context in which agents form their expectations.

Finally, our study suggests directions for future work. Of particular interest is the role that variations in the strategic environment play in shaping individual forecasting rules, and how policies can best steer expectations in environments prone to self-fulfilling dynamics. Our multi-period experimental design can serve as a useful tool for examining this issue.

Acknowledgements

We thank Elena Asparouhova, Te Bao, John Duffy, Rosemarie Nagel, Charles Noussair, Yoshiyasu Ono, Luba Petersen, Marc Willinger, and the participants at the 2023 CEFM workshop (Nagasaki), AP-ESA 2023 (Seoul), EF 2023 (Sofia), 2023 ESA World Meeting (Lyon), and 2024 Experimental Macro Workshop (Singapore) for their comments and helpful discussions. This research has benefited from the financial support of (a) the Joint Usage/Research Center (JJURC), the Institute of Social and Economic Research (ISER), and Osaka University, and (b) Grants-in-aid for Scientific Research Nos. 18K19954, 20H05631, 23H00055, and 25H00388 from the Japan Society for the Promotion of Science. The design of the experiment reported in this paper was approved by the IRB of ISER (#20221102) in November 2022. The experiment is preregistered at aspredicted.org (#128898), which is available at https://aspredicted.org/ix6th.pdf. We thank Yuka Tamura and Yuta Shimodaira for their help in preparing the software for the experiment, and Yuki Hamada, Hiroko Shibata, Saori Shikibu, and Satsuki Yamada for their support in conducting the experiment. The authors used ChatGPT (OpenAI) and Claude (Anthropic) for English language editing and proofreading. All content was reviewed and approved by the authors, who take full responsibility for the accuracy and integrity of the manuscript. The replication material for the study is available at https://doi.org/10.17605/OSF.IO/GECV8.

Footnotes

¹ Note that unlike experiments in which participants actually trade in the market (e.g., Akiyama et al., 2017; Asparouhova et al., 2016; Crockett et al., 2019; Haruvy et al., 2007; Smith et al., 1988), set prices (e.g., Fehr & Tyran, 2008; Noussair et al., 2015; Orland & Roos, 2013, 2019; Petersen, 2015), or set quantities (e.g., Bosch-Domènech & Vriend, 2003; Huck et al., 1999, 2004; Offerman et al., 2002), participants in learning-to-forecast experiments do not trade, set prices, or determine quantities; they only forecast. This eliminates the need for participants to trade or set prices (or quantities) optimally and thus allows us to focus on their expectation formation and its aggregate consequences. See Bao et al. (2013) for a comparison between learning-to-forecast experiments and those where participants need to decide on the quantities (which those authors call “learning-to-optimize” experiments).

² Learning-to-forecast experiments can be broadly categorized into those based on univariate environments and those based on multivariate environments. The former correspond mostly to those based on an asset-pricing model, such as Hommes et al. (2005), Heemeijer et al. (2009), Bao et al. (2017), Anufriev et al. (2022a), and Evans et al. (2022), as well as simple demand and supply dynamics, as in Bao and Duffy (2016), Bao and Hommes (2019), and Evans et al. (2025). The last paper, Evans et al. (2025), is one of the most closely related to our paper. The main difference is that we study a multi-period beauty contest game and analyze how participants respond to anticipated shocks, not unanticipated shocks. Experiments using multivariate environments largely correspond to those based on the New Keynesian macroeconomic model, such as Adam (2007), Pfajfar and Zakelj (2014, 2018), Arifovic and Petersen (2017), Hommes et al. (2019), Rholes and Petersen (2021), Assenza et al. (2021), Mauersberger (2021), Petersen and Rholes (2022), Ahrens et al. (2023), and Lustenhouwer and Salle (2025), as well as those based on more abstract settings (Anufriev et al., 2022b; Levelt et al., 2021).
These multivariate learning-to-forecast experiments based on the New Keynesian model are used to investigate policy-relevant questions such as the effectiveness of different monetary policy rules (Assenza et al., 2021; Cornand & M’Baye, 2018; Kostyshyna et al., 2022; Mauersberger, 2021; Pfajfar & Zakelj, 2014, 2018), provisions of central bank projections and communications (Ahrens et al., 2023; Arifovic et al., 2023; Kryvtsov & Petersen, 2021; Mokhtarzadeh & Petersen, 2021), and the effectiveness of these policies at the zero lower bound (Arifovic & Petersen, 2017; Hommes et al., 2019).

³ There are a few recent exceptions in both univariate and multivariate environments. In the univariate environment, Colassante et al. (2020) elicit a series of forecasts for multiple future periods; however, unlike our framework, the forecasts beyond the one for the next period do not determine the market outcome. In the multivariate environment, Adam (2007), Rholes and Petersen (2021), and Petersen and Rholes (2022) elicit one- and two-period-ahead forecasts of inflation rates, and these forecasts determine current outcomes. In Lustenhouwer and Salle (2025), participants submit their forecasts for output in multiple future periods, and these forecasts determine the current outcome. Our proposed experiment is in the univariate environment, and all the submitted forecasts can potentially determine the current outcome.

⁴ This approach stands in stark contrast to survey-based studies on expectation formation, which elicit expectations of the same future periods multiple times and study their evolution (e.g., Coibion et al., 2018).

⁵ The strategic environment effect has also been demonstrated in other contexts, including a price-setting game (Cooper et al., 2021; Fehr & Tyran, 2008; Funaki et al., 2023), a duopoly game (Potters & Suetens, 2009), and one-shot beauty contest games (Hanaki et al., 2019; Sutan & Willinger, 2009). More recently, Evans et al. (2022) and Anufriev et al. (2022a) show that the strategic environment effect attenuates in dynamic settings that differ from ours. Their frameworks are based on asset-pricing models and elicit forecasts either for the average across multiple future periods (Evans et al., 2022) or for a single distant period (Anufriev et al., 2022a).

⁶ In our experimental framework, only the average forecast is of primary importance. In broader contexts, however, it may be necessary to examine higher-order moments of the forecast distribution—a line of inquiry reserved for future research.

⁷ In New Keynesian models, the time horizon $T$ is typically set to infinity, consistent with the assumption of firms with infinite lives. By contrast, experimental constraints make it unfeasible to collect an unbounded sequence of forecasts. Accordingly, we impose a finite time horizon $T$ in our experimental design.

⁸ One may consider a group of six to be too small for an experiment to have macroeconomic implications. However, the main results of existing learning-to-forecast experiments do not change even if conducted with larger groups of 20 to 30 participants (Bao et al., 2020) or even close to 100 participants (Hommes et al., 2021).

⁹ All forecasts are constrained to be integers.

¹⁰ Our experimental design may prompt participants to adopt a more forward-looking reasoning compared to existing univariate learning-to-forecast experiments because we ask participants to provide forecasts for several future periods rather than only one.

¹¹ This way of rewarding forecast accuracy is also used in, for example, Adam (2007), Assenza et al. (2021), and Anufriev et al. (2022b).

¹² If the new forecasts in periods 3 and 4 do not become payoff-relevant, then $F_3^i = f^i_{1,3} = 12$ and $F_4^i = f_{1,4}^i = 12$.

¹³ Due to an oversight in the experiment, the price was determined with $\gamma=0$ instead of $\gamma=0.05$. The incorrect price determination equation was also communicated to participants in the instructions (see Appendix F). As shown in Appendix B, however, this oversight results in the REE price differing only slightly (by less than one). Furthermore, because $P_{t}$ is rounded to the nearest integer, we believe this oversight does not affect the experimental outcomes.

¹⁴ In our experiment, this information is communicated to the participants with the sequence of random numbers that determine the termination of the game in each period displayed on the same screen.

¹⁵ In the instructions, participants are informed that the multi-period game can be repeated if all groups complete a game within 30 minutes from the first game’s start. When a new game starts, participants are randomly re-grouped into groups of six (requiring everyone in the session to wait until all groups have finished), and they play with the same parameter values. If the game is repeated multiple times, one of the games will be randomly selected for payment; see Appendix F.

¹⁶ Because participants are restricted to submitting integer-valued forecasts, multiple equilibria may emerge under strong strategic complementarities. However, all such equilibria are local to the REE examined in the main text. As a result, this multiplicity does not materially affect our findings. See Appendix D for a detailed discussion.

¹⁷ With this specification, 75% ( $ = 1 - (0.5)^2$) of participants can change their forecasts before the realization of the shocks.

¹⁸ See Appendix E for details regarding how we conducted our online experiment.

¹⁹ We did not add a button for participants to review the instructions on the main decision screen during the experiment because we considered having correctly answered all the questions in the comprehension quiz to demonstrate that they understood the instructions.

²⁰ However, in one session with an announcement of the future shock with $\beta=0.9$, one participant decided to leave the experiment while answering the comprehension quiz. The experimenter replaced this participant, allowing the experiment to continue. We exclude this group’s data from the analyses, but because we did the same treatment in which 30 participants showed up, we have a total of eight groups for this treatment. Furthermore, in one session without announcement of the shock and $\beta=-1.8$, one of the participants lost his or her internet connection around period 11 and switched to a different mode of connection. As a result, an error occurred, and the experiment could not continue for this group. In addition, in the same session, an error occurred for a group in period 40. Thus, we only have data for seven groups for this treatment, with one group missing the prices and forecasts submitted in period 40. Finally, in the treatment with announcement of the shocks and $\beta=-0.9$, one group encountered a technical problem, and the experiment stopped in period 33. Thus, while we have data for eight groups for this treatment, one lacks the data from period 34 onward.

²¹ The exchange rate between points earned during the experiments and JPY was 1 point = 2 JPY.

²² This result seems to contradict Assenza et al. (2021), Pfajfar and Zakelj (2014, 2018), and Mauersberger (2021), who study the effectiveness of monetary policy in the New Keynesian learning-to-forecast experiment. These studies demonstrate that a more forceful monetary policy response to deviations from the inflation target enhances the strength of the negative feedback mechanism, thereby improving the stabilization of aggregate dynamics. However, it is possible that even the strongest monetary policy reaction considered in these experiments is not strong enough to make the degree of strategic substitutability sufficiently negative, as considered in our experiment and those by Bao and Duffy (2016) and Evans et al. (2025). This is because the Calvo pricing friction attenuates the impact of monetary policy on the sensitivity of expectation.

²³ Here we use the REE price based on the price determination equation used in the experiment: namely, the one that set $\gamma=0$. See footnote 13.

²⁴ They are all significantly different from zero at the 5% level, according to the signed-rank test. The $p$-values are 0.016 for $\beta=-1.8$ without an announcement and 0.008 for the other five treatments.

²⁵ Based on the Kruskal–Wallis test, $p \lt 0.05$ for both cases. For pairwise comparisons, $p$-values based on the two-tailed Mann-Whitney test are always less than 1%.

²⁶ For pairwise comparisons, $p$-values based on the two-tailed Mann-Whitney test are 0.0002 ( $\beta=0.9$ vs. $\beta=-0.9$), 0.0012 ( $\beta=0.9$ vs. $\beta=-1.8$), and 0.1520 ( $\beta=-0.9$ vs. $\beta=-1.8$) without an announcement and 0.0650 ( $\beta=0.9$ vs. $\beta=-0.9$), 0.0379 ( $\beta=0.9$ vs. $\beta=-1.8$), and 0.2345 ( $\beta=-0.9$ vs. $\beta=-1.8$) with an announcement.

²⁷ In fact, RDs are not significantly different from zero for $\beta=-1.8$ without an announcement ( $p=0.6875$, signed-rank test), while they are significantly different from zero at the 5% significance level for the other five treatments ( $p=0.0391$ for $\beta=0.9$ with an announcement and 0.0078 for the remaining four treatments, signed-rank test).

²⁸ We also examine how individuals respond to unanticipated shocks, as analyzed by Evans et al. (2025), in Appendix C.

²⁹ Note that this condition is sufficient but not necessary. As explained below, even if an individual revises her forecasts, she might do so for another reason and ignore the announcement.

³⁰ It is important to emphasize that we cannot compare $\Delta f_{1}^{i}$ and $\Delta f_{2}^{i}$ with the actual price changes in our data, $P_{14}-P_{13}$ and $P_{29}-P_{28}$. This is because $\Delta f_{1}^{i}$ and $\Delta f_{2}^{i}$ are revisions of the forecasts over $P_{14}-P_{13}$ and $P_{29}-P_{28}$, not the forecasts of these price changes.

³¹ Indeed, we can consider a revision over a longer price change, defined as $\Delta f_{1}^{i} = E^{i}\left(P_{15} - P_{12} \mid t \leq 12\right) - E^{i}\left(P_{15} - P_{12} \mid t \leq 11\right)$. The results of the analysis below remain unchanged when using this measure.

³² While Figure 5 indicates some heterogeneity across treatments, most differences are not statistically significant. We apply pairwise proportion tests to compare the shares of overreactions, underreactions, and no revisions. In most cases, the $p$-values exceed 10%. A notable exception is the comparison between $\beta = 0.9$ and $\beta = -0.9$, where the fraction of participants who overreact to the second anticipated shock differs significantly.

³³ Note that our classification exercise differs from that of Evans et al. (2025). They classify the forecasts right after the shocks without pre-announcements, not revisions, and classify current forecasts according to level- $k$. In Appendix C, we carry out the same exercise as Evans et al. (2025) in our dynamic setting based on the treatment without an announcement.

³⁴ One potential difficulty with our analysis is that unlike Evans et al. (2025), we must specify the sequence of default forecasts. In a dynamic environment such as ours, there is little reason to assume that these defaults remain constant. Exploring how to model level- $k$ forecasts without fixed defaults and how participants select their initial forecast sequence in a dynamic environment is left for future research.

³⁵ Our experimental design includes only two shocks and two corresponding announcements. If participants were exposed to more than two announcements, it is possible that learning would emerge over time. We leave this question for future research.
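As a reader's check on the signed-rank $p$-values in footnotes 24 and 27, the sketch below enumerates the exact null distribution of the Wilcoxon signed-rank statistic. It assumes an exact two-sided test on group-level observations with no ties or zero differences, which the text does not state explicitly; the function name `signed_rank_exact_p` is hypothetical. Under these assumptions, the reported values 0.016 and 0.008 coincide with the smallest $p$-values attainable with seven and eight groups, respectively (see footnote 20 for the group counts).

```python
from itertools import product

def signed_rank_exact_p(n, w_plus):
    """Exact two-sided p-value for a positive-rank sum w_plus with n untied, non-zero differences."""
    total = le = ge = 0
    # Under the null, each of the 2**n sign assignments to ranks 1..n is equally likely.
    for signs in product((0, 1), repeat=n):
        w = sum(rank for rank, s in zip(range(1, n + 1), signs) if s)
        total += 1
        le += w <= w_plus
        ge += w >= w_plus
    return min(1.0, 2 * min(le, ge) / total)


# All differences share one sign: the most extreme outcome for each sample size.
p_seven_groups = signed_rank_exact_p(7, 28)  # 7 groups, maximal rank sum 1 + ... + 7 = 28
p_eight_groups = signed_rank_exact_p(8, 36)  # 8 groups, maximal rank sum 1 + ... + 8 = 36
# Eight groups with a negative-rank sum of 3 (positive-rank sum 33).
p_eight_near_extreme = signed_rank_exact_p(8, 33)
```

The three values are $2/128$, $2/256$, and $10/256$, which round to 0.016, 0.0078, and 0.0391. This matches the figures in the footnotes but does not by itself establish which test variant the authors used.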

References

Adam, K. (2007). Experimental evidence on the persistence of output and inflation. The Economic Journal, 117(520), 117. https://doi.org/10.1111/j.1468-0297.2007.02043.x

Ahrens, S., Lustenhouwer, J., & Tettamanzi, M. (2023). The stabilizing effects of publishing strategic central bank projections. Macroeconomic Dynamics, 27(3), 826–868. https://doi.org/10.1017/S1365100521000687

Akiyama, E., Hanaki, N., & Ishikawa, R. (2017). It is not just confusion! Strategic uncertainty in an experimental asset market. The Economic Journal, 127(605), F563–F580. https://doi.org/10.1111/ecoj.12338

Angeletos, G. M., & Lian, C. (2018). Forward guidance without common knowledge. American Economic Review, 108(9), 2477–2512. https://doi.org/10.1257/aer.20161996

Anufriev, M., Chernulich, A., & Tuinstra, J. (2022a). Asset price volatility and investment horizons: An experimental investigation. Journal of Economic Behavior and Organization, 193, 19–48. https://doi.org/10.1016/j.jebo.2021.11.019

Anufriev, M., Duffy, J., & Panchenko, V. (2022b). Learning in two-dimensional beauty contest games: Theory and experimental evidence. Journal of Economic Theory, 201(C), 105417. https://doi.org/10.1016/j.jet.2022.105417

Anufriev, M., & Hommes, C. (2012). Evolutionary selection of individual expectations and aggregate outcomes in asset pricing experiments. American Economic Journal: Microeconomics, 4(4), 35–64.

Arifovic, J., & Petersen, L. (2017). Stabilizing expectations at the zero lower bound: Experimental evidence. Journal of Economic Dynamics and Control, 82(C), 21–43. https://doi.org/10.1016/j.jedc.2017.04.005

Arifovic, J., Salle, I., & Truong, H. (2023). History-dependent monetary policy: A lab experiment and a HENK model. Working paper 4586602, SSRN. https://doi.org/10.2139/ssrn.4586602

Asparouhova, E., Bossaerts, P., Roy, N., & Zame, W. (2016). “Lucas” in the laboratory. Journal of Finance, 71(6), 2727–2779. https://doi.org/10.1111/jofi.12392

Assenza, T., Heemeijer, P., Hommes, C., & Massaro, D. (2021). Managing self-organization of expectations through monetary policy: A macro experiment. Journal of Monetary Economics, 117(C), 170–186. https://doi.org/10.1016/j.jmoneco.2019.12.005

Bao, T., & Duffy, J. (2016). Adaptive versus eductive learning: Theory and evidence. European Economic Review, 83(C), 64–89. https://doi.org/10.1016/j.euroecorev.2015.12.007

Bao, T., Duffy, J., & Hommes, C. (2013). Learning, forecasting, and optimizing: An experimental study. European Economic Review, 61(C), 186–204. https://doi.org/10.1016/j.euroecorev.2013.04.003

Bao, T., Hennequin, M., Hommes, C., & Massaro, D. (2020). Coordination on bubbles in large-group asset pricing experiments. Journal of Economic Dynamics and Control, 110(C), 103702. https://doi.org/10.1016/j.jedc.2019.05.009

Bao, T., & Hommes, C. (2019). When speculators meet suppliers: Positive versus negative feedback in experimental housing markets. Journal of Economic Dynamics and Control, 107(C), 103730. https://doi.org/10.1016/j.jedc.2019.103730

Bao, T., Hommes, C., & Makarewicz, T. (2017). Bubble formation and (in)efficient markets in learning-to-forecast and -optimize experiments. The Economic Journal, 127(605), F581–F609. https://doi.org/10.1111/ecoj.12341

Bao, T., Hommes, C., & Pei, J. (2021). Expectation formation in finance and macroeconomics: A review of new experimental evidence. Journal of Behavioral and Experimental Finance, 32(C), 100591. https://doi.org/10.1016/j.jbef.2021.100591

Bao, T., Hommes, C., Sonnemans, J., & Tuinstra, J. (2012). Individual expectations, limited rationality and aggregate outcomes. Journal of Economic Dynamics and Control, 36(8), 1101–1120. https://doi.org/10.1016/j.jedc.2012.03.006

Bosch-Domènech, A., & Vriend, N. J. (2003). Imitation of successful behaviour in Cournot markets. The Economic Journal, 113(487), 495–524. https://doi.org/10.1111/1468-0297.00129

Calvo, G. A. (1983). Staggered prices in a utility maximizing framework. Journal of Monetary Economics, 12(3), 383–398. https://doi.org/10.1016/0304-3932(83)90060-0

Chen, D. L., Schonger, M., & Wickens, C. (2016). oTree - An open-source platform for laboratory, online and field experiments. Journal of Behavioral and Experimental Finance, 9(C), 88–97. https://doi.org/10.1016/j.jbef.2015.12.001

Coibion, O., Gorodnichenko, Y., & Kumar, S. (2018). How do firms form their expectations? New survey evidence. American Economic Review, 108(9), 2671–2713. https://doi.org/10.1257/aer.20151299

Colassante, A., Alfarano, S., Camacho-Cuena, E., & Gallegati, M. (2020). Long-run expectations in a learning-to-forecast experiment: A simulation approach. Journal of Evolutionary Economics, 30(1), 75–116.

Cooper, K., Schneider, H., & Waldman, M. (2021). Limited rationality and strategic environment: Further evidence from a pricing game. Journal of Behavioral and Experimental Economics, 90(C), 101632. https://doi.org/10.1016/j.socec.2020.101632

Cornand, C., & M’Baye, C. K. (2018). Does inflation targeting matter? An experimental investigation. Macroeconomic Dynamics, 22(2), 362–401. https://doi.org/10.1017/S1365100516000250

Crockett, S., Duffy, J., & Izhakian, Y. (2019). An experimental test of the Lucas asset pricing model. Review of Economic Studies, 86(2), 627–667.

Duffy, J. (2017). Macroeconomics: A survey of laboratory research. In J. H. Kagel & A. E. Roth (Eds.), Handbook of experimental economics (Vol. 2, pp. 1–90). Princeton University Press.

Duffy, J., & Puzzello, D. (2014). Experimental evidence on the essentiality and neutrality of money in a search model. Research in Experimental Economics, 17, 259–311.

Duffy, J., & Puzzello, D. (2022). The Friedman rule: Experimental evidence. International Economic Review, 63(2), 671–698. https://doi.org/10.1111/iere.12549

Evans, G. W., Gibbs, C. G., & McGough, B. (2025). A unified model of learning to forecast. American Economic Journal: Macroeconomics, 17(2), 101–33.

Evans, G. W., Hommes, C., McGough, B., & Salle, I. (2022). Are long-horizon expectations (de-)stabilizing? Theory and experiments. Journal of Monetary Economics, 132(C), 44–63.

Farhi, E., & Werning, I. (2019). Monetary policy, bounded rationality, and incomplete markets. American Economic Review, 109(11), 3887–3928. https://doi.org/10.1257/aer.20171400

Fehr, E., & Tyran, J.-R. (2008). Limited rationality and strategic interaction: The impact of the strategic environment on nominal inertia. Econometrica, 76(2), 353–394.

Fréchette, G. R., & Yuksel, S. (2017). Infinitely repeated games in the laboratory: Four perspectives on discounting and random termination. Experimental Economics, 20(2), 279–308. https://doi.org/10.1007/s10683-016-9494-z

Funaki, Y., Kawamura, K., Uto, N., & Ueda, K. (2023). Price stickiness and strategic uncertainty: An experimental study. CAMA Working Paper 63/2023, Crawford School of Public Policy, Australian National University.

Galí, J. (2015). Monetary policy, inflation and the business cycle (2nd ed.). Princeton University Press.

García-Schmidt, M., & Woodford, M. (2019). Are low interest rates deflationary? A paradox of perfect-foresight analysis. American Economic Review, 109 (1), 86–120.

Greiner, B. (2015). An online recruitment system for economic experiments. Journal of the Economic Science Association, 1 (1), 114–125. doi:10.1007/s40881-015-0004-4

Hanaki, N., Koriyama, Y., Sutan, A., & Willinger, M. (2019). The strategic environment effect in beauty contest games. Games and Economic Behavior, 113 (C), 587–610. doi:10.1016/j.geb.2018.11.006

Haruvy, E., Lahav, Y., & Noussair, C. N. (2007). Traders’ expectations in asset markets: Experimental evidence. American Economic Review, 97 (5), 1901–1920. doi:10.1257/aer.97.5.1901

Heemeijer, P., Hommes, C., Sonnemans, J., & Tuinstra, J. (2009). Price stability and volatility in markets with positive and negative expectations feedback: An experimental investigation. Journal of Economic Dynamics and Control, 33 (5), 1052–1072. doi:10.1016/j.jedc.2008.09.009

Hommes, C., Kopányi-Peuker, A., & Sonnemans, J. (2021). Bubbles, crashes and information contagion in large-group asset market experiments. Experimental Economics, 24 (2), 414–433. doi:10.1007/s10683-020-09664-w

Hommes, C., Massaro, D., & Salle, I. (2019). Monetary and fiscal policy design at the zero lower bound: Evidence from the lab. Economic Inquiry, 57 (2), 1120–1140.

Hommes, C., Sonnemans, J., Tuinstra, J., & van de Velden, H. (2005). Coordination of expectations in asset pricing experiments. Review of Financial Studies, 18 (3), 955–980. doi:10.1093/rfs/hhi003

Huck, S., Normann, H.-T., & Oechssler, J. (1999). Learning in Cournot oligopoly - An experiment. Economic Journal, 109 (454), C80–C95. doi:10.1111/1468-0297.00418

Huck, S., Normann, H.-T., & Oechssler, J. (2004). Two are few and four are many: Number effects in experimental oligopolies. Journal of Economic Behavior and Organization, 53 (4), 435–446. doi:10.1016/j.jebo.2002.10.002

Kostyshyna, O., Petersen, L., & Yang, J. (2022). A horse race of monetary policy regimes: An experimental investigation. Working Paper 30530, National Bureau of Economic Research. doi:10.3386/w30530

Kryvtsov, O., & Petersen, L. (2021). Central bank communication that works: Lessons from lab experiments. Journal of Monetary Economics, 117 (C), 760–780. doi:10.1016/j.jmoneco.2020.05.001

Levelt, E., Hommes, C. H., & Hennequin, M. (2021). Planar learning to forecast market games. Working paper 3882461, SSRN.

Lustenhouwer, J., & Salle, I. (2025). Learning to be rational in the presence of news: A lab investigation. European Economic Review, 172 (C), 104948. doi:10.1016/j.euroecorev.2024.104948

Marimon, R., & Sunder, S. (1993). Indeterminacy of equilibria in a hyperinflationary world: Experimental evidence. Econometrica, 61 (5), 1073–1107. doi:10.2307/2951494

Mauersberger, F. (2021). Monetary policy rules in a non-rational world: A macroeconomic experiment. Journal of Economic Theory, 197 (C), 105203.

Mokhtarzadeh, F., & Petersen, L. (2021). Coordinating expectations through central bank projections. Experimental Economics, 24 (3), 883–918. doi:10.1007/s10683-020-09684-6

Nagel, R. (1995). Unraveling in guessing games: An experimental study. American Economic Review, 85 (5), 1313–1326.

Noussair, C. N., Pfajfar, D., & Zsiros, J. (2015). Pricing decisions in an experimental dynamic stochastic general equilibrium economy. Journal of Economic Behavior and Organization, 109 (C), 188–205. doi:10.1016/j.jebo.2014.10.016

Offerman, T., Potters, J., & Sonnemans, J. (2002). Imitation and belief learning in an oligopoly experiment. Review of Economic Studies, 69 (4), 973–997.

Orland, A., & Roos, M. W. M. (2013). The New Keynesian Phillips curve with myopic agents. Journal of Economic Dynamics and Control, 37 (10), 2270–2286. doi:10.1016/j.jedc.2013.05.015

Orland, A., & Roos, M. W. M. (2019). Price-setting with quadratic adjustment costs: Experimental evidence. Journal of Economic Behavior and Organization, 163 (C), 88–116.

Petersen, L. (2015). Do expectations and decisions respond to monetary policy? Journal of Economic Studies, 42 (6), 972–1004. doi:10.1108/JES-08-2015-0149

Petersen, L., & Rholes, R. (2022). Macroeconomic expectations, central bank communication, and background uncertainty: A COVID-19 laboratory experiment. Journal of Economic Dynamics and Control, 143 (C), 104460. doi:10.1016/j.jedc.2022.104460

Pfajfar, D., & Zakelj, B. (2014). Experimental evidence on inflation expectation formation. Journal of Economic Dynamics and Control, 44 (C), 147–168.

Pfajfar, D., & Zakelj, B. (2018). Inflation expectations and monetary policy design: Evidence from the laboratory. Macroeconomic Dynamics, 22 (4), 1035–1075. doi:10.1017/S1365100516000560

Potters, J., & Suetens, S. (2009). Cooperation in experimental games of strategic complements and substitutes. Review of Economic Studies, 76 (3), 1125–1147. doi:10.1111/j.1467-937X.2009.00548.x

Rholes, R., & Petersen, L. (2021). Should central banks communicate uncertainty in their projections? Journal of Economic Behavior and Organization, 183 (C), 320–341. doi:10.1016/j.jebo.2020.11.013

Smith, V. L., Suchanek, G. L., & Williams, A. W. (1988). Bubbles, crashes, and endogenous expectations in experimental spot asset markets. Econometrica, 56 (5), 1119–1151. doi:10.2307/1911361

Stöckl, T., Huber, J., & Kirchler, M. (2010). Bubble measures in experimental asset markets. Experimental Economics, 13 (3), 284–298. doi:10.1007/s10683-010-9241-9

Sutan, A., & Willinger, M. (2009). Guessing with negative feedback: An experiment. Journal of Economic Dynamics and Control, 33 (5), 1123–1133. doi:10.1016/j.jedc.2009.01.005

Woodford, M. (2003). Interest and prices: Foundations of a theory of monetary policy. Princeton University Press.

Table 1. Hypothetical submitted forecasts

Fig 1. Aggregate price $P_{t}$ under rational expectation

Note: The solid vertical lines indicate when the two shocks occur, while the dotted lines in the left panel show when these shocks are announced.

Table 2. Number of groups per treatment

Fig 2. Realized aggregate prices $P_{t}$ (a) Positive Feedback: $\beta=0.9$ (b) Weak Negative Feedback: $\beta=-0.9$ (c) Strong Negative Feedback: $\beta=-1.8$

Fig 3. Deviations from the rational expectation prices (a) Relative Absolute Deviation (RAD) (b) Relative Deviation (RD)

Note: p-values of Kruskal-Wallis test are reported.

Fig 4. Forward-lookingness of expectations: case without announcement (a) Distributions of Forecast Revisions After the First Announcement $\Delta f_{1}^{i}$ (b) Distributions of Forecast Revisions After the Second Announcement $\Delta f_{2}^{i}$

Fig 5. Forward-lookingness of expectations: case with announcement (a) Distributions of Forecast Revisions After the First Announcement $\Delta f_{1}^{i}$ (b) Distributions of Forecast Revisions After the Second Announcement $\Delta f_{2}^{i}$

Fig 6. Classification of participants based on depth of reasoning

Fig 7. Forward-lookingness of expectations (a) Empirical Distribution of Forecast Revisions After the First Announcement $\Delta f_{1}^{i}$ (b) Empirical Distribution of Forecast Revisions After the Second Announcement $\Delta f_{2}^{i}$

Table 3. Average $R^2$ and classification of types across treatments and forecast horizons

Table 4. Conditional means and standard deviations of coefficients

Table 5. Summary statistics of interaction terms by treatment
