SHARP TEST FOR EQUILIBRIUM UNIQUENESS IN DISCRETE GAMES WITH PRIVATE INFORMATION AND COMMON KNOWLEDGE UNOBSERVED HETEROGENEITY

This paper proposes a test of the single equilibrium in the data assumption commonly maintained when estimating static discrete games of incomplete information. By allowing for discrete common knowledge payoff-relevant unobserved heterogeneity, the test generalizes existing methods attributing all correlation between players’ decisions to multiple equilibria. It does not require the estimation of payoffs and is therefore useful in empirical applications leveraging multiple equilibria to identify the model’s primitives. The procedure boils down to testing the emptiness of the set of data generating processes that can rationalize the sample through a single equilibrium and a finite mixture over unobserved heterogeneity. Under verifiable conditions, this testable implication is generically sufficient for degenerate equilibrium selection. The main identifying assumption is the existence of an observable variable that plays the role of a proxy for the unobservable heterogeneity. Examples of such proxies are provided based on empirical applications from the existing literature.


INTRODUCTION
Economic models of strategic interactions among agents often admit multiple equilibria. Multiplicity of equilibria in the model may be seen as an economic problem and some equilibrium refinement can be used to determine which equilibrium or which equilibria should be considered. In empirical games, multiplicity of equilibria in the data generating process is an econometric issue that must be taken into account when trying to recover the model's primitives. In fact, identification arguments available in the literature differ according to the assumptions maintained on the number of equilibria realized in a given sample. Testing for a single equilibrium being realized in the data, an assumption also stated as the equilibrium selection mechanism being degenerate, is therefore desirable to guide applied researchers toward an appropriate estimation approach. Furthermore, assumptions on the number of equilibria realized in the data have different implications depending on the information structure of the game, i.e., whether the unobservables from the econometrician's point of view are private information or common knowledge among players. It follows that tests allowing for both private information and common knowledge unobservables may be preferable in practice.
This paper provides a test of the single equilibrium in the data assumption, allowing for both private information and discrete common knowledge payoff-relevant unobservables (henceforth unobserved heterogeneity, for short). 1 Since the null hypothesis of interest depends on unobserved heterogeneity, it cannot be directly tested. The current paper derives sharp testable implications of this null hypothesis. The identification argument is nonparametric: no parametric assumptions are needed for the payoff functions or for the distribution of private information shocks; but the distribution of common knowledge unobserved heterogeneity and the equilibrium selection mechanism are assumed to have discrete supports. More precisely, the observable joint distribution of players' decisions can be written as a finite mixture and partial identification results from Henry, Kitamura, and Salanié (2014) are used to derive sharp bounds for the distributions defining this finite mixture. Leveraging results from Kasahara and Shimotsu (2014), the paper shows that a degenerate equilibrium selection mechanism imposes further restrictions on these distributions. The identified set constructed from all these restrictions being nonempty is a testable implication of the null hypothesis of equilibrium uniqueness in the data generating process. This paper also provides conditions under which this testable implication is generically sufficient for degenerate equilibrium selection, i.e., the set of data generating processes for which the testable implication fails to be sufficient has Lebesgue measure zero. Moreover, the conditions defining the identified set are such that one can test the hypothesis of discrete unobserved heterogeneity separately from the equilibrium uniqueness assumption. The identification result relies on the existence of an observable variable that can be interpreted as a proxy for the common knowledge unobservable heterogeneity.
It must (i) have sufficient variation; (ii) be correlated with these common knowledge unobservables; and (iii) provide only redundant information about players' decisions and the equilibrium selection if such unobservables were actually observed. The test is implemented through Chernozhukov, Lee, and Rosen's (2013) intersection bounds approach and simulation results suggest that it performs well.

1 "Common knowledge unobservables" is often used in the literature on empirical games to describe information that is known to all players, but not to the econometrician. From a game theory point of view, these unobservables are "public information." In this paper, the former is preferred to avoid any confusion with existing work on empirical games.
How one treats multiple equilibria typically depends on how one is willing to interpret the information unobservable to the econometrician. 2 In games where one assumes that all unobservables are known to all players, i.e., games of complete information, set-identified estimators have been proposed to recover the set of the model's primitives that can rationalize the data for any possible equilibrium selection mechanism (e.g., Tamer, 2003; Ciliberto and Tamer, 2009; Beresteanu, Molchanov, and Molinari, 2011; Galichon and Henry, 2011; Kline and Tamer, 2016; etc.). In that sense, such estimation methods are robust to multiplicity of equilibria. In contrast, many estimation methods ask the econometrician to take a stance on whether or not there are multiple equilibria in the data when estimating games of incomplete information, i.e., games assuming that unobservables are players' private information. On the one hand, it is often assumed that the data have been generated by a single equilibrium at least for some state variables (e.g., Aguirregabiria and Mira, 2007; Bajari, Benkard, and Levin, 2007; Pakes, Ostrovsky, and Berry, 2007; Pesendorfer and Schmidt-Dengler, 2008; Bajari et al., 2010; Aradillas-López, 2012; Lewbel and Tang, 2015; etc.). On the other hand, multiple equilibria realized in the data provide an extra source of variation that helps to identify the primitives of the model (e.g., Sweeting, 2009; de Paula and Tang, 2012; Aradillas-López and Gandhi, 2016). Multiple equilibria are therefore a valuable alternative to commonly used player-specific exclusion restrictions when the latter are not available in practice. To leverage this source of variation, one must provide evidence that there are multiple equilibria being realized in the data by rejecting the degenerate equilibrium selection mechanism assumption.
According to the single equilibrium in the data assumption, every time the same players play the same game, the same equilibrium is realized. For example, consider a game of market entry between two players, firm A and firm B. Suppose that this game has the following two equilibria: either A is more likely to enter the market than B, or vice versa. Such equilibria may arise in markets that are typically too small to justify simultaneous entry. In an econometric study of the entry behavior of A and B, one would typically observe firms' entry decisions in several markets. The single equilibrium in the data assumption states that if A is more likely to enter than B in one specific market, then it also has to be more likely to enter than B whenever the same game is realized in another market. This assumption is maintained even if B being more likely to enter than A is also sustainable in equilibrium.
Of course, the single equilibrium in the data assumption substantially simplifies the estimation by avoiding the need to solve for all the model's equilibria: the only relevant equilibrium is the one realized in the data and it can therefore be estimated. However, if the assumption is falsely maintained, the resulting estimates are associated with a mixture of equilibria, which is typically not an equilibrium in itself. Some tests of the single equilibrium in the data assumption have been proposed in the literature. 3 Two different approaches can be distinguished. The first one includes tests from de Paula and Tang (2012), Hahn, Moon, and Snider (2017) and Xiao (2018). These tests treat correlation in players' decisions as evidence against the single equilibrium in the data assumption. In other words, they require players' decisions to be mutually independent after controlling for observable common knowledge information and the selected equilibrium. This requirement implies that all unobservables from the econometrician's point of view are players' independent private information. However, an alternative explanation of correlation in players' decisions would be that the unobservables interpreted as private information shocks are actually, at least partially, observed by competitors (e.g., Navarro and Takahashi, 2012). Therefore, in such tests, common knowledge unobservable heterogeneity is ruled out by assumption and, more importantly, may lead to the false rejection of the degenerate equilibrium selection hypothesis.
The second approach has the advantage of allowing for common knowledge unobserved heterogeneity. The test proposed here belongs to this category. Aguirregabiria and Mira (2019) and de Paula and Tang (2020) are the papers most closely related to the current one. A detailed discussion of each paper's respective contribution is given in Section 2. At this point, it is worth mentioning the main distinctions. As opposed to Aguirregabiria and Mira (2019), the test introduced below does not require estimating the payoff functions based on commonly used player-specific exclusion restrictions to separate multiple equilibria from common knowledge unobservable heterogeneity. 4 This distinction is relevant for an applied researcher who does not observe such exclusion restrictions, but instead hopes to use multiple equilibria to identify the model's primitives (as in Sweeting, 2009; de Paula and Tang, 2012; Aradillas-López and Gandhi, 2016). Of course, this advantage is not for free. The test proposed in the current paper requires observing a proxy for the common knowledge unobserved heterogeneity. In that sense, it trades exclusion restrictions for a proxy variable. Examples of suitable candidates for proxies are discussed in Section 4. The recent working paper by de Paula and Tang (2020) also avoids the need to estimate payoffs, but requires the researcher to group realizations of the game into clusters within which equilibrium selection is correlated if there are multiple equilibria. This grouping, typically based on some observables, is not needed in the test proposed here. 5

The approach proposed in the current paper should be interpreted as a joint test of two assumptions: the single equilibrium in the data assumption and the finite mixture representation of the unobserved heterogeneity. In particular, the test assumes the presence of common knowledge unobserved heterogeneity in the data generating process. An appealing by-product of the method is that the presence of unobserved heterogeneity can be separately tested without making assumptions on the number of equilibria realized in the data. I am not aware of the existence of such a test in the literature. In principle, one could therefore first test the relevance of unobserved heterogeneity to decide whether it should be taken into account when testing the single equilibrium in the data assumption in a second step. While this sequential approach would lead to possible pre-testing issues, providing a statistical method robust to these concerns is outside the scope of the current paper. In fact, this pre-testing problem would arise regardless of whether one is using the method proposed here or another existing test of the single equilibrium in the data assumption in the second step. Alternatively, one may first use the joint test from the current paper and then, if this test rejects and one is worried that the rejection may be due to the absence of unobserved heterogeneity, one could separately test the latter. Proceeding in this order does not de facto require a pre-testing correction in the case where the joint test is not rejected.

3 The focus of the current paper is on static games. For dynamic games, Otsu, Pesendorfer, and Takahashi (2016) and Luo, Xiao, and Xiao (2022) have proposed tests which will be further discussed below.

4 In that sense, the proposed test is aligned with a recommendation made by Sweeting (2009, p. 740): "[...] allowing for multiple equilibria can significantly increase computational costs [...] and relying on multiple equilibria for identification may make the results even more dependent on the correct specification of the model. For these reasons, I see particular value in future research [...] aimed at developing tests for the possible presence of multiple equilibria without requiring the full model to be estimated."

5 More precisely, a grouping based on unobservables is built into the finite mixture identification results.
The rest of the paper is organized as follows. Related literature is summarized in Section 2, with special attention being paid to some useful results on the nonparametric identification of finite mixtures. A static discrete game with simultaneous decisions is introduced in Section 3. The nonparametric identification results and the statistical test are respectively presented in Sections 4 and 5. Monte Carlo simulations are reported in Section 6. Section 7 concludes. Proofs and further details are included in Appendixes.

RELATED LITERATURE
As mentioned above, a few papers from the literature on static games propose tests of equilibrium uniqueness in the data generating process. These tests are usually obtained as by-products of identification results. While these identification results provide great insights, some caveats about the corresponding tests have already been pointed out in Section 1. In fact, many of the existing tests (e.g., de Paula and Tang, 2012; Hahn et al., 2017; Xiao, 2018) require players' equilibrium-specific decisions to be independent given the observable information, hence ruling out common knowledge unobserved heterogeneity. In a static game of pure incomplete information, this simply follows from the conditional independence of unobservable private shocks. In this setting, testing for equilibrium uniqueness boils down to testing whether players' decisions are conditionally independent.
The same conditional independence is also key to use recent nonparametric identification results from the literature on finite mixtures and measurement errors 6 (e.g., Hall and Zhou, 2003; Hu, 2008; Kasahara and Shimotsu, 2009, 2014; Bonhomme, Jochmans, and Robin, 2016; and the related but somewhat different identification arguments in Hu and Shum, 2012). In games of pure incomplete information, one can use results from this literature to identify a lower bound on the number of equilibria occurring in the sample. This lower bound corresponds to the number of components in the joint distribution of players' decisions represented as a finite mixture over the multiple equilibria. This is the approach proposed by Xiao (2018).
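To illustrate the flavor of this lower bound, consider the following hedged sketch (the two-component construction and all probabilities are hypothetical, not taken from the cited papers). Under conditional independence given the latent component, the 2 × 2 joint probability matrix of two binary decisions is a weighted sum of rank-one outer products, so its rank bounds the number of mixture components from below:

```python
import numpy as np

# Hypothetical two-component mixture over "equilibria": within each component,
# the two players' binary decisions are independent, so the joint probability
# matrix P[y1, y2] is a weighted sum of rank-one outer products.
lam = np.array([0.6, 0.4])                # mixing weights
p1 = np.array([[0.2, 0.8], [0.8, 0.2]])   # rows: [P(y1=0|k), P(y1=1|k)] per component k
p2 = np.array([[0.3, 0.7], [0.7, 0.3]])   # same for player 2

P = sum(w * np.outer(a, b) for w, a, b in zip(lam, p1, p2))

print(P.sum())                            # probabilities sum to 1
print(np.linalg.matrix_rank(P))           # 2: at least two components in the data
print(np.linalg.matrix_rank(np.outer(p1[0], p2[0])))  # 1: a single component is rank one
```

A rank strictly greater than one rules out a single conditionally independent component, which is precisely why this bound cannot, by itself, distinguish multiple equilibria from common knowledge unobserved heterogeneity.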
Unfortunately, as mentioned above, even if there is a single equilibrium in the data, conditional independence breaks down if players also take into account payoff-relevant information that is known to all of them, but unobservable to the econometrician. In such cases, tests based on conditional independence cannot be applied. 7 The main issue is that, if one finds the correlation between players' decisions to be nonzero or if one finds more than one component in the finite mixture representing choice probabilities, it could be due to multiple equilibria in the data, common knowledge unobserved heterogeneity, or both. In other words, such unobservables may lead to the false rejection of the single equilibrium in the data hypothesis.
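This confounding is easy to see numerically. In the hedged sketch below (all probabilities are hypothetical), players' decisions are independent conditional on a latent state, yet mixing over that state, whether it represents a selected equilibrium or common knowledge heterogeneity, induces unconditional correlation:

```python
import numpy as np

# Hypothetical two-component mixture: within each latent state, the two
# players' binary decisions are independent.
weights = np.array([0.6, 0.4])    # mixing weights over latent states
p1 = np.array([0.8, 0.2])         # P(y1 = 1 | latent state)
p2 = np.array([0.7, 0.3])         # P(y2 = 1 | latent state)

joint = float(weights @ (p1 * p2))                   # unconditional P(y1 = 1, y2 = 1)
product = float(weights @ p1) * float(weights @ p2)  # product of unconditional marginals

print(joint, product)   # ≈ 0.36 vs. ≈ 0.3024
# joint > product: decisions are unconditionally correlated even though they
# are independent within each latent state. The data alone cannot say whether
# that latent state is an equilibrium or common knowledge heterogeneity.
```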
Some progress has been made to allow for both private information and common knowledge unobserved heterogeneity in empirical games. In particular, Grieco (2014) proposes a parametric model that he estimates using data on grocery store entry and exit. His results suggest that failing to include both private and common knowledge unobservable information may generate misleading results. More recently, Magnolfi and Roncoroni (2022) propose an estimation method applicable to games defined via an alternative equilibrium concept (Bayes Correlated Equilibrium, developed by Bergemann and Morris, 2016) that allows for very weak assumptions on the information structure of the game. 8 In their empirical application, they also find that assumptions maintained on the information structure have an important impact on parameter estimates and counterfactual predictions. Such recent practical insights justify the need to extend tests of equilibrium uniqueness beyond the pure incomplete information setting and to allow for another source of possible correlation between players' decisions not due to multiple equilibria being realized in the data.
A few papers propose semi-parametric identification results that allow for multiple equilibria and common knowledge unobserved heterogeneity in static games of incomplete information. One of them is Khan and Nekipelov (2018), who show that the linear strategic interaction parameters are identified even if continuous unobserved heterogeneity is drawn from an unknown distribution. However, the identification results are obtained under the assumption that the equilibrium selection mechanism is uniform over the equilibria of the model and therefore does not vary with the common knowledge state variables. When testing the single equilibrium in the data assumption, it is preferable not to make such an assumption on the equilibrium selection mechanism. Moreover, their identification result requires observing large realizations of player-specific regressors. As mentioned above, testing the single equilibrium in the data assumption is especially desirable when such regressors are not available.

7 Xiao (2018) briefly discusses the idea of using her identification result to test for common knowledge unobserved heterogeneity at the end of her Section 2.2. However, this procedure cannot be used to test for equilibrium uniqueness conditional on common knowledge unobserved heterogeneity.

8 See Syrgkanis, Tamer, and Ziani (2018) for an econometric application of the same solution concept to auctions. While the information structure used in Magnolfi and Roncoroni (2022) is more flexible than the one considered in the current paper, point identification of the structural parameters under Bayes Correlated Equilibrium requires player-specific regressors, which are not needed to perform the test proposed in the current paper. Moreover, the identification arguments using multiple equilibria in the data leveraged in Sweeting (2009), de Paula and Tang (2012), and Aradillas-López and Gandhi (2016) are based on Bayesian Nash equilibria. The information structure considered here is more restrictive than that in Magnolfi and Roncoroni (2022), but still more flexible than games of pure incomplete information.
Another paper is Magesan (2018) who shows identification of the payoffs provided that the equilibrium selection mechanism is degenerate for some realizations of the game. In that sense, the test proposed here is complementary to his identification results.
As mentioned above, Aguirregabiria and Mira (2019) is closely related to the current paper. Their main proposition follows from a sequential identification argument which combines results from the literature about nonparametric identification of finite mixtures. In a first step, they identify the nonparametric distribution of a discrete random variable with finite support that summarizes the information of the common knowledge unobservable heterogeneity and the unobservable variable that indicates which equilibrium is realized. Using the property that the equilibrium selection variable is payoff-irrelevant, the distribution of the unobservable heterogeneity and the equilibrium selection can be separately identified. In their setting, the number of equilibria corresponds to the cardinality of the support of the unobservable variable that selects the equilibrium realized in the data.
The sequential approach in Aguirregabiria and Mira's (2019) main identification result may be problematic if one is solely interested in testing equilibrium uniqueness. There are two limitations worth pointing out. First, in their setting, one must estimate payoff functions to separate common knowledge unobserved heterogeneity from the variable indexing which equilibrium is realized. Doing so typically requires player-specific exclusion restrictions, i.e., variables that only affect a given player's decision through its beliefs about its competitors' behavior (e.g., Bajari et al., 2010). However, one important motivation for testing equilibrium uniqueness is when such exclusion restrictions are not available and one would like to leverage multiplicity of equilibria as an alternative source of variation to identify payoffs. In such cases, one cannot apply Aguirregabiria and Mira's (2019) identification results to test for equilibrium uniqueness.
A second limitation, which also applies to other tests based on finite mixtures, is that the finite mixture framework restricts the number of components identifiable from the data. This restriction may not be innocuous. The largest number of components that one can identify in the first step of Aguirregabiria and Mira's (2019) sequential argument is given by the number of alternatives available in the players' choice set, raised to the power ⌊(N − 1)/2⌋, where N is the number of players and ⌊·⌋ is the floor function. As a result, no mixture would be identifiable in a game of market entry between two players. Aguirregabiria and Mira (2019) also propose nonsequential results which require the exclusion restrictions commonly used in empirical games to be sufficiently over-identifying. While their nonsequential approach has some advantages (e.g., it may allow for N = 2), it is still not applicable without variables satisfying these exclusion restrictions.
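To fix orders of magnitude, the bound can be computed directly. A minimal sketch (the function name is mine, for illustration only): the bound is J raised to the power ⌊(N − 1)/2⌋, where J is the number of alternatives and N the number of players.

```python
def max_identifiable_components(n_players: int, n_alternatives: int) -> int:
    """Upper bound on the number of identifiable mixture components:
    J ** floor((N - 1) / 2)."""
    return n_alternatives ** ((n_players - 1) // 2)

# With two players and binary decisions (J = 2), the bound is 2 ** 0 = 1:
# no mixture is identifiable in a two-player entry game.
print(max_identifiable_components(2, 2))  # 1
print(max_identifiable_components(3, 2))  # 2
print(max_identifiable_components(5, 3))  # 9
```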
Of course, similar restrictions on the identifiable number of components are likely to arise in other identification arguments based on a finite mixture. The approach proposed here is no exception, but the conditions imposed here are less restrictive. A considerable advantage of the current paper's procedure is that the number of equilibria is not restricted prior to the test. The restriction only affects the support of the common knowledge unobservable heterogeneity.
The second paper most closely related to the current one is the recent paper by de Paula and Tang (2020). They also propose testable implications for the single equilibrium in the data assumption which avoid having to estimate payoffs. This paper extends de Paula and Tang (2012) by allowing for players' private information to be correlated possibly via continuous unobserved heterogeneity. For static games, the test assumes that the researcher can group realizations of the game into clusters within which equilibrium selection is correlated if there are multiple equilibria. In other words, the arguments from de Paula and Tang (2012) apply within clusters which are defined by the researcher based on observables. To some extent, the finite mixture identification results leveraged in the test proposed below can be interpreted as a clustering which is based on unobservables that are left unspecified by the researcher (up to the support being finite and discrete) and other observable characteristics. Also, while the test proposed by de Paula and Tang (2020) is applicable to binary decisions, the test in the current paper allows for an arbitrary number of discrete choices.
At this point, it is worth distinguishing between two different approaches that have been proposed in the literature about nonparametric identification of finite mixtures: (i) the conditional independence, and (ii) the exclusion restriction (which will be interpreted as a proxy variable) approaches. In both cases, the main objective is to identify the number of components, the conditional component distributions, and the mixing probability weights.
In the conditional independence approach, the joint distribution of observed variables conditional on the latent mixing variable can be factored as the product of its marginals. A system of equations is constructed by considering different subvectors of the vector of mixed variables. For instance, in the context of a game with N players, one could consider the joint distributions of all subsets of players' decisions. Point identification is reached if one can construct enough equations to identify all the corresponding marginal conditional component distributions and mixing probability weights. The conditional independence approach has been used by, among others, Hall and Zhou (2003), Kasahara and Shimotsu (2009), and Bonhomme et al. (2016). This conditional independence is also needed to identify the number of mixture components when using Kasahara and Shimotsu's (2014) results.
Alternatively, the exclusion restriction approach assumes that there exists an observable variable with sufficient variation that affects the mixing probability weights, but not the conditional component distributions. 9 With this restriction, one can write the conditional component distributions and the mixing probability weights in terms of some set-identified parameters. Henry et al. (2014) introduced this approach and also provided an extensive discussion suggesting that the required exclusion often arises naturally in applied work. An especially relevant feature of this alternative approach is that the joint distribution conditional on the mixing variable does not have to be factorable into the product of its marginals. This is key in the context of the current paper: because the observable conditional choice probabilities are potentially mixed over multiple equilibria and common knowledge unobserved heterogeneity, players' decisions may fail to be independent after controlling for only one of these two types of unobservables.
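A stylized numerical sketch of this exclusion restriction may help (the variable z, the weights, and the component probabilities below are all hypothetical): the proxy moves the mixing weights while the component distributions stay fixed, so variation in the observed probabilities across z carries information about the components.

```python
import numpy as np

# Two latent components with fixed conditional probabilities of, say, joint
# entry; the proxy z shifts only the mixing weights (the exclusion).
component_probs = np.array([0.56, 0.06])  # P(y1 = 1, y2 = 1 | component), fixed in z

def observed_prob(z: int) -> float:
    """Observed P(y1 = 1, y2 = 1 | z): only the mixing weights depend on z."""
    weights = {0: np.array([0.6, 0.4]), 1: np.array([0.2, 0.8])}[z]
    return float(weights @ component_probs)

print(observed_prob(0))  # ≈ 0.36
print(observed_prob(1))  # ≈ 0.16
# As z varies, the observed probability moves within the convex hull of the
# component values (0.06 and 0.56) without changing them, which is what
# delivers bounds on the component distributions and weights.
```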
It should be emphasized that the main object of interest in the current paper is the number of equilibria realized in the data generating process; these equilibria form a subset of the equilibria of the model. In that sense, the objective is very different from other work, such as Kasy (2015), that focuses on the number of equilibria in the model. In fact, when applying his inference method to a game of incomplete information, Kasy (2015) first estimates the model using a two-step approach. Such a two-step estimation relies on the single equilibrium in the data assumption, which can be tested using the method proposed here. Another related paper is Espin-Sanchez, Parra, and Wang (2022), who provide a testable condition that guarantees equilibrium uniqueness given estimates of the payoffs. Once again, estimating payoffs requires first taking a stance on whether a single or multiple equilibria are realized in the data.
Finally, it is worth pointing out that some progress has recently been made in testing for equilibrium uniqueness with common knowledge unobserved heterogeneity in dynamic games (see, for instance, Otsu et al., 2016, Sect. 3.5; Luo, Xiao, and Xiao, 2022; de Paula and Tang, 2020, Sect. 3). A nice feature of the dynamic setting is that the econometrician typically observes each market for more than one period, which gives some traction when trying to separate time-invariant unobserved heterogeneity from multiple equilibria. Unfortunately, this extra dimension, i.e., time, is often not available in the static case. For this reason, the test proposed in the current paper is not directly comparable with tests for equilibrium uniqueness in dynamic games. However, it is interesting to note that, even with this extra dimension, tests applicable to dynamic games also focus on discrete unobserved heterogeneity.

9 Notice that this exclusion restriction does not correspond to the one commonly used to identify payoff functions as in Bajari et al. (2010). This is partly the reason why "proxy" will often be preferred to "exclusion restriction" when referring to this approach in the current paper, even if the identification result is in fact based on an exclusion restriction.

A STATIC DISCRETE GAME WITH SIMULTANEOUS DECISIONS
This section describes the economic model, i.e., the game as it is played by the players, and its econometric counterpart. Players' decisions are contingent on some state variables, which are separated into two categories depending on whether they are observed by all players. Let S = [S_1, S_2, ..., S_N] with realizations s = [s_1, s_2, ..., s_N] ∈ S^N be some information that is common knowledge to all players. Furthermore, let E = [E_1, E_2, ..., E_N] with realizations ε = [ε_1, ε_2, ..., ε_N] such that ε_i = [ε_{i1}, ..., ε_{iJ}] ∈ R^J be some private information. Let G_{E_i}(·) denote the cumulative distribution function of E_i. Because player i's opponents do not observe ε_i, this is a game of incomplete information.

Economic Model
Let π_i(·) : Y^N × S × R → R be player i's payoff function. While the payoff of player i choosing y_i = 0 is normalized to 0, the payoff when choosing y_i = j ≠ 0 is denoted by π_i(j, y_{−i}, s, ε_{ij}), where y_{−i} refers to the decisions of player i's opponents. The following assumption, which is common in the literature, is maintained on the payoff functions and the distributions of S and E.
Assumption 1 (State variables and payoffs). (i) S, E_1, E_2, ..., E_N are mutually independent. (ii) G_{E_i}(·) for all i ∈ {1, ..., N} are common knowledge to all players and they are absolutely continuous with respect to the Lebesgue measure on R^J. (iii) π_i(·) for all i ∈ {1, ..., N} are common knowledge to all players.
It is worth mentioning that Assumption 1(i) is actually stronger than what will be required by the statistical test. Since the test will be performed after conditioning on a subvector of realizations of S, E is only required to be independent of the remaining elements in S. A more precise version of Assumption 1(i) is given in Assumption 2(iv).
The timing of the decision process is as follows. First, s and ε are realized. Even if players do not observe their opponents' private information, they can still form beliefs about their competitors' decisions under Assumption 1. Then, all players simultaneously decide, i.e., y is realized and commonly observed. To sum up, at the time of the simultaneous decisions, player i's information set is

I_i = {s, ε_i}. (1)

Player i's strategy is a function that maps the information set, I_i, to the choice set, i.e., σ_i(·) : I_i → Y. For a given strategy, the conditional choice probability of player i choosing j at a given s ∈ S is

p_i(j|s) = ∫ 1{σ_i(s, ε_i) = j} dG_{E_i}(ε_i), (2)

which can be interpreted as the beliefs of player i's opponents regarding player i's decision, when player i behaves according to strategy σ_i(·). Let π^p_i(j, s, ε_{ij}) denote the expected payoff of player i choosing y_i = j, which is computed by integrating out y_{−i} using the corresponding elements of p(s). If each player's strategy is to maximize expected payoffs, (2) is equivalent to

p_i(j|s) = ∫ 1{π^p_i(j, s, ε_{ij}) ≥ π^p_i(j′, s, ε_{ij′}) ∀ j′ ∈ Y} dG_{E_i}(ε_i). (3)

The right-hand side of equation (3) is the best response mapping of player i given its beliefs regarding its opponents' decisions. Let ψ_{ij}(s, ·) denote this mapping, collect choice-specific mappings in ψ_i(s, ·) = [ψ_{i1}(s, ·), ψ_{i2}(s, ·), ..., ψ_{iJ}(s, ·)], and define

Ψ(s, ·) = [ψ_1(s, ·), ψ_2(s, ·), ..., ψ_N(s, ·)], (4)

where Ψ(s, ·) : [0,1]^{JN} → [0,1]^{JN} is a mapping in the probability space. Given Ψ(s, ·), one can define a Bayesian Nash Equilibrium (BNE) in pure strategies. Defining a BNE in the probability space is very convenient to analyze equilibrium existence and multiplicity (Milgrom and Weber, 1985).
Definition 1 (BNE in probability space). A pure strategy BNE in the probability space is a set of conditional choice probabilities p(s) such that p(s) = Ψ(s, p(s)).
Definition 1 simply states that, in equilibrium, players' beliefs are consistent with their opponents' behavior. In fact, a BNE in the probability space is a fixed point of the best response mapping. Since Ψ(s,·) maps a compact set to itself and since it is continuous in p(s), the existence of an equilibrium follows from Brouwer's fixed-point theorem for any s ∈ S. However, uniqueness is not guaranteed. Let T(s) be the set collecting all equilibria that the model admits at a given s.
In order to fix ideas, it is worth introducing a running example that will be used as a data generating process to illustrate several concepts and results throughout the paper. This example is a simple static game of market entry between two firms.
Example 1 (Simple game of market entry). Consider two firms deciding whether they want to operate in a given market, such that y_i = 1 if firm i operates in the market and y_i = 0 otherwise. In this case, S could be some common knowledge information about market size and consumers' preferences. Moreover, E could refer to some private information cost shifters such as managerial ability. Let players' payoffs when operating in the market be

π_1(1, y_2, s, ε_11) = s_1 − 4y_2 + ε_11,  (5)
π_2(y_1, 1, s, ε_21) = s_2 − 3y_1 + ε_21,  (6)

and player i's payoffs when y_i = 0 are normalized to 0. Furthermore, let ε = [ε_11, ε_21] be drawn from Normal(0, I_2), where I_2 is the 2 × 2 identity matrix. Using Φ(·) to denote the standard normal cumulative distribution function, the best response mapping is

Ψ(s, p(s)) = [Φ(s_1 − 4 p_2(1|s)), Φ(s_2 − 3 p_1(1|s))].  (7)
Figure 1 is the graphical representation of the best response mapping in (7) for all p(s) ∈ [0,1]^2. The BNE(s) are given by the intersection(s) of the best response functions. This figure clearly illustrates that, for a given set of primitives, different realizations of S are associated with different BNEs and, in particular, with different numbers of BNEs.
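To make the multiplicity in Example 1 concrete, the sketch below numerically recovers the fixed points of the best response mapping in (7). This is an illustration only: the grid size, the root-finding scheme, and the particular value s = (2, 1.5) are choices made here, not taken from the paper.

```python
import math

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def best_response(s, p):
    """Best response mapping of the two-firm entry game in (7):
    firm 1 enters with prob Phi(s1 - 4*p2), firm 2 with Phi(s2 - 3*p1)."""
    s1, s2 = s
    p1, p2 = p
    return (Phi(s1 - 4.0 * p2), Phi(s2 - 3.0 * p1))

def equilibria(s, grid=2000, tol=1e-10):
    """Find all BNE by locating the roots of
    g(p2) = p2 - Phi(s2 - 3*Phi(s1 - 4*p2)) on [0, 1]."""
    s1, s2 = s
    g = lambda p2: p2 - Phi(s2 - 3.0 * Phi(s1 - 4.0 * p2))
    roots = []
    for k in range(grid):
        a, b = k / grid, (k + 1) / grid
        ga, gb = g(a), g(b)
        if ga == 0.0:          # grid point is an exact root
            roots.append(a)
            continue
        if ga * gb < 0.0:      # sign change: bisect to refine
            for _ in range(100):
                m = 0.5 * (a + b)
                gm = g(m)
                if abs(gm) < tol:
                    break
                if ga * gm < 0.0:
                    b = m
                else:
                    a, ga = m, gm
            roots.append(0.5 * (a + b))
    if g(1.0) == 0.0:
        roots.append(1.0)
    return [(Phi(s1 - 4.0 * p2), p2) for p2 in roots]
```

At s = (2, 1.5) this procedure finds three equilibria (including the symmetric one at p = (0.5, 0.5)), while at a value such as s = (3, −3) the equilibrium is unique, illustrating how different realizations of S come with different numbers of BNEs.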

Econometric Model
The game described so far is the economic game as it is played by the players. Consider now the econometric game, i.e., the game as it is observed by the econometrician. An important difference between the two is that the researcher only observes some of the common knowledge payoff-relevant state variables in S. The following assumption is maintained on S.
Typically, V can be thought of as a vector of common knowledge unobservable heterogeneity. 11 It is written as a vector-valued random variable to emphasize that it may capture different forms of unobserved heterogeneity. It could be a vector of player-specific unobservables or a scalar capturing unobservables that do not vary across players.
Finite and discrete unobserved heterogeneity is arguably a natural first step toward allowing for common knowledge unobserved heterogeneity when testing the single equilibrium in the data assumption. Currently available tests would reject the null hypothesis of a single equilibrium in the data if players' decisions were sufficiently correlated conditional on observed state variables. The test proposed here can be interpreted as a device to check whether sufficient correlation remains when allowing for some common knowledge unobserved heterogeneity. In that sense, allowing for a fixed number of possible realizations of unobservables (without constraining their distribution) is a crude, but useful approximation to detect correlation due to such unobservable information.
One appealing feature of the discrete unobserved heterogeneity in the current setting is that the support V (x) is allowed to vary with x, such that different realizations of the game are allowed to have different realizations of discrete unobserved heterogeneity. Further, |V (x)| is identifiable from the data and the method allows the researcher to check whether the data reject assumptions maintained on the support of V (x), separately from the equilibrium uniqueness hypothesis. More details are provided in Section 4.
In order to appreciate the relevance of using discrete unobserved heterogeneity, it is worth discussing alternative approaches proposed in the literature to allow for common knowledge unobserved heterogeneity in static discrete games of incomplete information. First of all, several empirical applications ignore such unobserved heterogeneity by assuming that all unobservables from the point of view of the econometrician are private information. Of course, when the data has a panel structure in which realizations of the game are observed along more than one dimension (e.g., multiple geographical markets observed over multiple periods of time in a game of market entry), introducing fixed effects may capture some of the unobservable information known to all players. Unfortunately, this option is not possible if one observes a single cross section of replications of the game.
One alternative proposed in the literature has been to introduce some parametric common knowledge unobserved heterogeneity. For instance, in some of his specifications, Sweeting (2009) allows for player-specific unobserved heterogeneity constant across repetitions of the game which is drawn from a distribution known up to some parameters and enters players' payoffs additively. Another example is Bajari et al. (2010) who assume that common knowledge unobserved heterogeneity is simply a function of observable state variables.
The current approach does restrict the support of the common knowledge unobserved heterogeneity to be finite and discrete, which is not the case of the specifications proposed by Sweeting (2009) and Bajari et al. (2010). However, it has two considerable advantages. First, it does not require distributional assumptions beyond the support being finite and discrete. Second, the researcher does not have to specify how this common knowledge unobserved heterogeneity enters players' payoffs. In fact, V simply defines different components of a finite mixture representation of players' choice probabilities, without having to specify an expression for the distribution of players' decisions conditional on ν.
It should be emphasized that finite mixture representations of common knowledge unobserved heterogeneity have been used when estimating dynamic discrete games. Both Aguirregabiria and Mira (2007) and Igami and Yang (2016) show that allowing for a finite number of points in the support of the common knowledge unobserved heterogeneity is able to capture confounding factors which, when ignored by the researcher, substantially bias strategic interaction estimates.
Notice that X being finite and discrete is also maintained by de Paula and Tang (2012) among others. While not essential in theory, this assumption is very convenient in practice since different realizations of X can be associated with different numbers of equilibria both in the model (see Figure 1) and in the data. This is the reason why the identification result holds for fixed realizations of X . If X is continuous, one must take into account potential discontinuities in choice probabilities conditional on realizations of X that may be introduced by different numbers of equilibria across these realizations. 12 Going back to the running example, |V (x)| = 2 can be used to capture relevant unobservable information in a simple game of market entry.
Example 1 (Simple game of market entry, continued). In the simple game of market entry between two firms, V could capture some information related to unobservable demand and/or cost shifters. While both firms have potentially gathered information about these shifters, such information may remain unobservable to the econometrician or may be impossible to measure accurately. In this setting, even the simplest possible case allowing for two elements in the support of V helps to better interpret firms' decisions.
Let V(x) = {ν^0, ν^1}. The interpretation of ν^0 and ν^1 depends on how they affect firms' probabilities of entering the market. For instance, p_i(1|x, ν^0) < p_i(1|x, ν^1) for i = 1, 2 would suggest that if V = ν^1, both firms are more likely to enter the market than if V = ν^0. In other words, conditional on x, V potentially reflects some market-level common knowledge unobserved heterogeneity. Alternatively, suppose that p_1(1|x, ν^0) < p_1(1|x, ν^1) and p_2(1|x, ν^0) > p_2(1|x, ν^1). One way to interpret V is that it captures some player-specific common knowledge unobserved heterogeneity that has a very different effect on firms' probabilities of entering the market given x. While ν^0 decreases firm 1's probability of entry relative to ν^1, it increases firm 2's.
Finally, since the game may admit multiple equilibria, the econometric model requires an equilibrium selection mechanism specifying the probability that a given equilibrium is realized in the data. Another important difference between the game played by the players and the game observed by the econometrician is that not all equilibria in the model are necessarily realized in the data. Let T*(x, ν) ⊆ T(x, ν) be the subset of the model's equilibria that are realized in the data. Furthermore, let λ(·|x, ν) be the equilibrium selection mechanism given the payoff-relevant information observable to all players. Let T be a random variable labeling which equilibrium is played. More precisely, each equilibrium realized in the data is indexed by τ ∈ T*(x, ν).13 The following assumption, which is also common in the literature (see for instance Aguirregabiria and Mira, 2019, Assum. 3, p. 1668), is maintained on T. It ensures that players' private information does not affect which equilibrium is realized and, consequently, that players cannot infer their competitors' private information from the realized equilibrium. It also rules out a continuum of equilibria.
The framework described via Assumptions 1-3 actually allows for quite flexible correlations between variables that are unobservable from the econometrician's point of view. There are two different vectors of payoff-relevant unobservables, i.e., E and V. Conditional on X, common knowledge unobservable heterogeneity introduces correlation between players' payoff-relevant unobservables via V even if the E_i are independent across players. Furthermore, V introduces correlation between payoff-relevant unobservables and the variable indicating which equilibrium is realized even if T is independent of E_1, ..., E_N.
Finally, the difference between T * (x,ν) and T (x,ν) is key to interpret the degenerate equilibrium selection assumption. In fact, it corresponds to the special case |T * (x,ν)| = 1 for all ν ∈ V (x). In other words, for a given x, λ ( τ | x,ν) = 1 for the unique τ ∈ T * (x,ν) for all ν ∈ V (x). While the economic model may admit multiple equilibria, this assumption requires the same equilibrium to be played whenever the same common knowledge payoff-relevant information is realized.

Data and Objects of Interest
When estimating an empirical game, the econometrician typically observes M realizations of the game. For each of these realizations, the data consist of {y_m, x_m : m = 1, ..., M}. There are therefore three random variables that are unobservable from the point of view of the researcher: (i) the private information shocks E; (ii) the common knowledge variables V; and (iii) the variable indicating which equilibrium is realized in the data, T. The assumptions maintained on the data generating process are formally stated in Assumption 4. Let p(y|x, ν, τ) denote the joint distribution of players' decisions conditional on (x, ν, τ).

Assumption 4 (Data generating process). {y_m : m = 1, ..., M} are M independent draws from the multinomial distribution with vector of probabilities of success defined according to p(·|x_m, ν_m, τ_m).
Assumptions 1 to 4 imply that, conditional on {x m ,ν m ,τ m } M m=1 , y im are independent across i and m. In fact, this conditional independence is all that is needed for a test of the single equilibrium in the data assumption based on the correlation between players' decisions after properly controlling for all players' common knowledge information.
Let p(y|x) be the conditional joint distribution of players' decisions that is point identified from the data. Given Assumptions 1 to 4, such distributions are double finite mixtures of the equilibrium conditional choice probabilities realized in the data. For a given (y, x) ∈ Y^N × X:

p(y|x) = Σ_{l=1}^{|V(x)|} Pr(ν^l | x) p(y | x, ν^l),  (8)

where

p(y | x, ν^l) = Σ_{τ ∈ T*(x, ν^l)} λ(τ | x, ν^l) Π_{i=1}^N p_i(y_i | x, ν^l, τ).  (9)

From equations (8) and (9), one can see that p(y|x) is a double finite mixture with a total of Σ_{l=1}^{|V(x)|} |T*(x, ν^l)| components. In the current setting, the main object of interest is the number of equilibria associated with each ν ∈ V(x) conditional on x, i.e., each element of the set {|T*(x, ν)| : ν ∈ V(x)}. These objects of interest are used to test whether the single equilibrium in the data assumption holds at each realization of the finite and discrete support of the common knowledge unobserved heterogeneity. Formally, the null and alternative hypotheses are stated as

H_0: |T*(x, ν)| = 1 for all ν ∈ V(x)  versus  H_1: |T*(x, ν)| > 1 for some ν ∈ V(x).
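The double mixture in (8) and (9) can be computed mechanically once its components are specified. The sketch below does so for hypothetical numbers chosen here (they are not estimates from any application); it also shows why conditioning on V matters: even when each ν has a single equilibrium, so that players' decisions are independent within each component, mixing over ν alone induces correlation between players' decisions given x.

```python
import itertools

def mixture_joint(ccp, select, nu_weights):
    """Double finite mixture p(y|x):
    p(y|x) = sum_l Pr(nu^l|x) sum_tau lambda(tau|x,nu^l) prod_i p_i(y_i|x,nu^l,tau).
    ccp[l][tau][i][j] = p_i(j|x,nu^l,tau); select[l][tau] = lambda(tau|x,nu^l)."""
    n_players = len(ccp[0][0])
    n_actions = len(ccp[0][0][0])
    p = {}
    for y in itertools.product(range(n_actions), repeat=n_players):
        total = 0.0
        for l, w_nu in enumerate(nu_weights):
            for tau, w_tau in enumerate(select[l]):
                prod = 1.0
                for i in range(n_players):
                    prod *= ccp[l][tau][i][y[i]]
                total += w_nu * w_tau * prod
        p[y] = total
    return p

# Hypothetical example: two players, two actions, two values of nu,
# one equilibrium per nu (so H0 holds and selection is degenerate).
ccp = [
    [[[0.1, 0.9], [0.1, 0.9]]],  # nu^0: both firms enter w.p. 0.9
    [[[0.9, 0.1], [0.9, 0.1]]],  # nu^1: both firms enter w.p. 0.1
]
select = [[1.0], [1.0]]          # degenerate equilibrium selection
p = mixture_joint(ccp, select, [0.5, 0.5])
# Covariance of the two entry indicators, conditional on x only:
cov = p[(1, 1)] - (p[(1, 0)] + p[(1, 1)]) * (p[(0, 1)] + p[(1, 1)])
```

Here cov is strictly positive even though |T*(x, ν)| = 1 for both values of ν: a test attributing all conditional correlation to multiple equilibria would spuriously reject H_0.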

The Need for More Structure
The main challenge in testing the degenerate equilibrium selection mechanism while allowing for common knowledge unobserved heterogeneity is to separate each source of potential correlation in players' decisions, i.e., T and V, conditional on X. More precisely, it does not suffice to check the total number of components in the double finite mixture in (8) and (9), i.e., Σ_{l=1}^{|V(x)|} |T*(x, ν^l)|.14 Testing H_0 rather requires being able to check the number of components conditional on the realizations of V. The problem here is that H_0 depends on the unobservable heterogeneity and, therefore, cannot be tested directly.
To solve this problem, one must first put more structure on the double finite mixture of interest. The additional structure imposed in the proposed test is based on identification results from Henry et al. (2014). While details about the argument are presented below, one can interpret the main requirement of this approach, when applied to the current setting, as the availability of a proxy variable for the common knowledge unobservable heterogeneity. Such proxy variable allows one to separate the mixture over V from the mixture over T . More importantly, it allows one to derive testable implications of H 0 without needing to estimate payoff functions.

Identifying Restrictions and Proxy Variable
The additional structure needed for identifying the mixture over the common knowledge unobserved heterogeneity is now stated in terms of exclusion restrictions, i.e., similarly as in Henry et al. (2014). The proxy variable interpretation will be highlighted later on. Once again, it is worth emphasizing that the following exclusion restriction is different from the one commonly used for the identification of discrete games.
Let the vector of observable state variables, X, be divided into a subvector of variables that do not satisfy the exclusion restriction, X_NE, and a subvector of variables that do satisfy it, X_E, such that X = (X_NE, X_E), with realizations x = (x_NE, x_E). Assumption 5 formally states the identifying restrictions.
(v) (Sufficient variability in mixture weights) For an appropriate subset X̄_E ⊆ X_E, the square matrix with (i,j)th element given by Pr(ν^j | x_NE, x̄_E^i) is nonsingular.

The support independence condition implies that the set of values of the mixing variables realized with a positive probability does not vary with X_E. This condition is important: in order to use variation in X_E to identify the finite mixture over V, such variation should not generate changes in the set of possible realizations of the mixing variable. Slightly abusing notation, V(x_NE) will be used for the rest of the paper to make this condition explicit.
The condition on the cardinality of the support is included to make sure that there is enough variation in Y and in X_E for the exclusion restriction approach to be able to capture all the relevant realizations of V. The rationale for min{(J + 1)^N, |X_E|} will become clear in Lemma 1. In a game where two players choose between two actions, the test allows for up to 2^2 = 4 different realizations of common knowledge unobserved heterogeneity, given x_NE, provided that there is enough variation in X_E. For a game where two players choose between three actions, there can be up to 3^2 = 9 values of common knowledge unobserved heterogeneity. If three players choose between two actions, this number is 2^3 = 8. In other words, even for very simple games, the test allows for an appreciable number of realizations of common knowledge unobserved heterogeneity, which is exponentially growing in the number of players. As will be discussed in Section 4.4, applying the test to existing examples from the literature suggests that several unobservable realizations can be accommodated when testing the single equilibrium in the data assumption.
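The cardinality bound can be stated compactly. The trivial helper below (names chosen here) reproduces the counts in the text:

```python
def max_support_points(J, N, n_xe):
    """Upper bound on |V(x_NE)|: min{(J+1)^N, |X_E|}, where J+1 is the
    number of actions, N the number of players, and n_xe the number of
    points in the support of the excluded variables X_E."""
    return min((J + 1) ** N, n_xe)
```

For example, with two players, two actions, and ample variation in X_E, the bound is 4; it drops to |X_E| whenever the proxy varies too little.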
The relevance condition requires the distribution of the unobservable V to depend on both X NE and X E . Notice that this condition does not contradict the support independence one as long as realizations of V do not become zero probability events for some realization of X E .
By the redundancy condition, the conditional choice probabilities, the equilibrium selection mechanism and the set of equilibria in the data generating process have to be independent of X_E after conditioning on X_NE and V.15 Once again, slightly abusing notation, such independence is made obvious by using p(y|x_NE, ν), p(y|x_NE, ν, τ), λ(τ|x_NE, ν) and T*(x_NE, ν). In other words, X_E provides some information about the distribution of the unobservable V, but would not provide any information about players' decisions nor the equilibrium selection if V were observable.
Finally, the assumptions of sufficient variability in mixture weights and in unknown choice probabilities can be interpreted as rank conditions ensuring that the variation induced by X_E and Y is sufficient to identify the mixture over V.

Examples of Proxy Variables
Of course, a natural question to ask at this point is whether variables satisfying restrictions stated in Assumption 5 are easy to find. Remember that V is observed by both players, but not by the econometrician. In some sense, one simply needs an observable variable that plays the role of a proxy for the unobservable common knowledge payoff-relevant variables. In particular, since one is not interested in quantifying the structural effects of such common knowledge unobserved heterogeneity per se, the test does not require precise measurements of these unobservables.
There are at least three categories of variables that can be used as proxies for common knowledge unobserved heterogeneity: imperfect measurements for suspected unobserved heterogeneity, number of players and predetermined outcomes.
The following examples, which are based on existing empirical applications, highlight the potential for multiple equilibria, the relevance of discrete unobserved heterogeneity and the proxies that could be used.16 Depending on the empirical application, some of these three categories of variables may be more useful than others. For instance, if the researcher has a clear understanding of the type of common knowledge payoff-relevant unobserved heterogeneity that should be controlled for, imperfect measurements of this suspected unobserved heterogeneity may be quite appealing. As discussed below, there are several examples of empirical applications where imperfect measurements are included as regressors to control for unobserved heterogeneity. If the nature of the unobserved heterogeneity is either not obvious or hard to control for using imperfect measurements, the number of players or predetermined outcomes may be more interesting options.

15 Notice that the redundancy condition implies independence between Y and X_E after conditioning on X_NE and V. In that sense, one may find the exclusion restriction approach for the identification of finite mixtures to be somewhat similar to the alternative approach relying on the independence of some observable variables after conditioning on the latent variable. However, since the distribution of V depends on X_E, one cannot use the independence between Y and X_E to identify the mixture over V as in the conditional independence approach.

16 Henry et al. (2014, Sect. 1.2) discuss several examples of variables that can be used as proxies for unobservables in different empirical applications. Part of their discussion covers unobserved heterogeneity in structural microeconomic models. In their oligopoly pricing example, Henry et al. (2014, Appendix A) suggest using measures of realized profits, which are ex ante unknown to the players, to control for unobserved demand variables. Such proxies would indeed be informative about unobserved heterogeneity beyond X_NE (relevance) as realized profits typically depend on common knowledge unobservables. However, it is not obvious that they can be applied in the current setting since realized profits would be informative about the equilibrium selection mechanism.
Imperfect Measurements of Suspected Unobserved Heterogeneity. In empirical applications, researchers may suspect a specific source of unobserved (or even unmeasurable) heterogeneity to act as a confounding factor likely to bias the estimation. Including imperfect measurements of these unobservables is frequently done to control for such confounding factors at least partially. Commonly used examples from different fields of economics are test scores to control for cognitive abilities, income for well-being, value of houses for wealth, etc.
For instance, in a game of market entry, unobservable cost and demand shifters may make a market relatively more profitable than what is actually captured by observable payoff-relevant variables. When players' payoffs have a reduced form interpretation, i.e., when demand and supply are not explicitly modeled, one may therefore choose to include some imperfect measurements for these shifters.
If observable variables are suitable controls for the unobservables they are imperfectly capturing, they are likely to satisfy the conditions required for X E to be valid excluded regressors. While they are imperfect measurements of the suspected unobserved heterogeneity, they are typically strongly correlated with it and therefore satisfy the relevance condition. However, conditional on these unobservables (and X NE ), the imperfect measurements generally do not affect payoffs nor equilibrium selection. In other words, imperfect measurements should not be more informative about players' decisions nor the equilibrium being realized in the data than the actual unobservables (given X NE ). It follows that the redundancy condition is satisfied as well.
It is important to note that the variable X E is also allowed to be correlated with X NE (just like V possibly is). However, for an imperfect measurement to be a valid proxy for V, it must be such that it does not directly affect players' decisions conditional on X NE and V: the only reason why the imperfect measurement X E is informative about players' decisions (conditional on X NE ) is because V is not observed. Otherwise, this imperfect measurement should have been included in X NE .
Example 2 (Imperfect measurements of demand and cost shifters). Aradillas-López and Gandhi (2016) propose a novel estimator which uses correlation between players' decisions generated by multiple equilibria being realized in the data to estimate games of ordered actions. In their empirical application, they consider a game between the three main chains of retail pharmacies in the US: CVS (i = 1), Rite Aid (i = 2), and Walgreens (i = 3). Each player decides how many stores to operate in each geographical market (a core-based statistical area).
Aradillas-López and Gandhi (2016) justify the need to allow for multiple equilibria being realized in the data by the nature of the strategy space. They provide simulation evidence that the relatively large choice set of the players implies that the model admits multiple equilibria. Their estimation method effectively detects some correlation between players' decisions, which they interpret as evidence of multiple equilibria being realized in the data.
Of course, correlation between players' decisions could also be generated by common knowledge unobserved heterogeneity. However, Aradillas-López and Gandhi (2016, p. 755) argue that, in this industry, there is no "obvious, compelling demand side unobservable at the market level (e.g., an unexplained taste for health) that cannot be conditioned out with observables (such as the number of doctors in the market)." The observables included in X in their estimation are: population, average income per household, population density, median age in the population, total number of business establishments, and the distance to the nearest distribution center of each player. The authors note that the number of business establishments is included "to control for supply side unobservables" such as costs related to zoning restrictions.
The method proposed here could be used to formally test for multiple equilibria being realized in the data using the total number of business establishments as X E to control for supply side unobservables such as unobserved zoning restrictions. While this imperfect measurement is correlated with the supply cost shifters suspected to introduce correlation between players' decisions, it is worth noting that it does not have a direct impact on pharmacies' payoffs nor the equilibrium being selected beyond the unobserved demand and cost shifters. For instance, it is not the total number of business establishments that makes a market more profitable, but rather zoning restrictions imperfectly measured by the number of business establishments, given all the other regressors included in X .
This source of suspected unobserved heterogeneity is an example of market-level unobservable. Suppose that each firm chooses between 0, 1, or 2 stores. Since N = 3, the test allows for up to |V(x_NE)| = 3^3 = 27 different types of markets at a given realization of X_NE, provided that there is enough variation in X_E.
It is worth emphasizing that introducing some imperfect measurements to control for unobservables is very common when estimating games. For instance, in her analysis of location choices in the video retail industry, Seim (2006, pp. 629-630) uses "business density as a catch-all proxy for the general business environment." When studying competition in the airline industry, Ciliberto and Tamer (2009, p. 1808) "compute the sum of the geographical distances between a market's endpoints and the closest hub of a carrier as a proxy for the cost that a carrier has to face to serve that market," because "firm- and market-specific measures of cost are not available." In fact, in many empirical applications, geographical distance is often interpreted as some imperfect measurement of unobserved costs or scale economies. Another example is the distance between stores in Jia's (2008) analysis of competition between chain stores and discount retailers. Distance between stores is included in the payoff function because "nearby stores split the costs of operation, delivery, and advertising to achieve scale economies" (Jia, 2008, p. 1276). In all these cases, the researcher has a clear interpretation of the unobserved heterogeneity to be controlled for.

An important remark is worth making at this stage. If proxy variables (X_E) are already used to control for common knowledge unobserved heterogeneity (V) when estimating empirical games, what prevents one from testing whether there is a single component in the mixture over realized equilibria directly from p(y|x_NE, x_E) instead of having to recover p(y|x_NE, ν)? Presumably, one could work directly with p(y|x_NE, x_E) if there were a one-to-one relationship between ν and x_E, given x_NE. While this one-to-one relationship may hold in some cases (e.g., using distance to proxy for transportation costs), it is far-fetched in others (e.g., total number of business establishments to control for supply side unobservables).
When the relationship is not one-to-one, p(y|x_NE, x_E) ≠ p(y|x_NE, ν), but the latter probability is needed to test the single equilibrium in the data assumption. Of course, this inequality is not problematic when one is solely interested in estimating payoffs: controlling for confounding factors does not require a one-to-one relationship between ν and x_E, conditional on x_NE.
When one does not have a clear prior about the potential source of unobserved heterogeneity, one cannot find an imperfect measurement for such unobservables. In those cases, one could consider the following two other possible types of proxies that satisfy the conditions associated with X_E.

Number of Players.
In some empirical applications, there may be variation in the number of players (e.g., potential entrants in a game of market entry) across different realizations of the game. Realizations of the game that are observationally equivalent in X NE , but are associated with different numbers of players, suggest that the researcher may fail to observe some information known to all players (relevance). A similar argument is often used to detect or to control for unobserved heterogeneity in auctions. 17 Under some conditions, the number of players does not affect the equilibrium of the game and is therefore not informative about which equilibrium will be realized beyond X NE and V (redundancy). This would be the case for games between symmetric players with strategic interactions depending linearly on the fraction of competitors making the same decision.
Formally, let the payoff of player i choosing j be

π_ij = π_j(x_NE, ν) + δ_j · (Σ_{k≠i} 1{y_k = j}) / (N − 1) + ε_ij,

which is identical across players (up to the private information shock ε_ij) and depends linearly on the fraction of the N − 1 competitors also choosing j, i.e., Σ_{k≠i} 1{y_k = j} / (N − 1). Since players are symmetric, the probability of choosing j given (x_NE, ν), i.e., p(j|x_NE, ν), does not vary across players. Player i's expected payoff of choosing j therefore becomes

π_j(x_NE, ν) + δ_j · p(j|x_NE, ν) + ε_ij,

which does not depend on N. Since the best response mapping defining a BNE in the probability space (see Definition 1) is a function of differences of expected payoffs, the fact that expected payoffs do not depend on N implies that the equilibrium (equilibria) of the game is (are) necessarily independent of N.
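The claim that the expected spillover term does not depend on N follows from the binomial mean: with N − 1 symmetric competitors each choosing j with probability p, the expected fraction choosing j is p for any N. A quick numerical check (notation chosen here):

```python
from math import comb

def expected_spillover(p, N, delta):
    """E[ delta * (number of the N-1 symmetric competitors choosing j) / (N-1) ]
    when each competitor chooses j independently with probability p.
    Since the binomial mean is (N-1)*p, this equals delta*p for every N >= 2."""
    return sum(delta * (k / (N - 1)) * comb(N - 1, k) * p ** k * (1 - p) ** (N - 1 - k)
               for k in range(N))
```

Because the expected spillover is delta*p regardless of N, differences of expected payoffs, and hence best responses and equilibria, are unchanged as N varies.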
The following example borrows from the model of commercial radio stations' advertisement timing decisions in Sweeting (2009). In particular, the case of symmetric stations is discussed in Sweeting (2009, p. 722) to highlight the identifying power of multiple equilibria realized in the data. The assumption of symmetric players may be imposed in empirical applications for which playerspecific observables are either not available or do not induce significant variation in payoffs.
Example 3 (Varying numbers of symmetric players across realizations of the game). Consider a game in which N radio stations simultaneously choose between J + 1 possible time slots to air radio advertisements. The payoff of station i choosing j is

π_ij = π_j(x_NE, ν) + δ_j · (Σ_{k≠i} 1{y_k = j}) / (N − 1) + ε_ij.

For instance, x_NE could include observable demographics of the geographic market in which radio stations compete and ν could represent some information known to all stations, but unobservable to the econometrician, such as regional-specific listening habits. Since stations are identical up to their private information shock (ε_ij), p(j|x_NE, ν) does not vary with i. Using the same argument as above, expected payoffs and stations' best responses do not depend on N, which implies that the symmetric BNE (equilibria) is (are) also independent of N. The timing decision studied by Sweeting (2009) is in fact a coordination game. Radio stations have an incentive to air their commercial breaks simultaneously to discourage listeners from switching between different channels to avoid advertisements. The model therefore admits multiple equilibria and different equilibria may be realized in the data.
If the number of radio stations (N) varies across realizations of the game, it can be used as X E . If two different regional markets with the same realization of X NE have different N's, it could be because they differ in V (relevance). Moreover, payoffs being symmetric across players and depending linearly on the fraction of competitors making the same decision imply that variation in N does not affect the equilibrium of the game (conditional on the realized X NE , V). It follows that, if one could observe the realized V, N would only provide redundant information about stations' decisions and equilibrium selection (redundancy).
Since choice probabilities do not vary across players, the maximum number of points in the support of V is now given by min{J + 1, |X_E|} instead of min{(J + 1)^N, |X_E|}. More details about this bound are available in the discussion following Lemma 1. The bound therefore no longer depends on the observed variation in N, only on the number of possible time slots (J + 1) considered in the analysis. In Sweeting (2009), J = 2 and, depending on market and station definitions, the average N varies between 4.9 and 13.3 stations per market. One could therefore identify up to three points of support in V.
One advantage of using the number of players as a proxy for the unobserved heterogeneity is that, as opposed to imperfect measurements, it does not require knowing what is the source of heterogeneity one would like to control for. As long as the unobserved heterogeneity known to all players induces variation in the number of players across replications of the game that have the same realizations of X NE , the number of players is informative about V regardless of what V captures. As mentioned above, this is also the reason why variation in the number of bidders is often used to control for common knowledge unobserved heterogeneity in auctions.
It is also worth emphasizing that the requirement of players being symmetric, which is needed for the redundancy condition to hold, is not particularly restrictive in the current setting. In fact, testing the degenerate equilibrium selection assumption is relevant in applications where one would like to leverage variation in realized equilibria to identify strategic interactions between players. This source of variation is especially useful when players' payoffs are symmetric (up to their private information shocks) since, in those cases, player-specific variation in payoffs is not observed by the researcher. In other words, the need to test for the single equilibrium in the data assumption is often justified by the symmetry in players' payoffs. Moreover, beyond the example from Sweeting (2009) mentioned above, strategic interactions depending linearly on the fraction of competitors making the same decisions are very common in empirical applications of peer effects and social interactions since Brock and Durlauf (2001, p. 239), who refer to such a specification as "proportional spillovers."

Predetermined Outcomes

Suppose that there are some common knowledge payoff-relevant variables which have the same effect on payoffs regardless of players' decisions given X_NE and V. Under some conditions, such variables are valid candidates for proxies. More precisely, let X_E be such that, for realizations x_E and x_E', the variation in payoffs it induces is identical across actions, i.e., π_i(j, y_−i, x_NE, x_E, ν) − π_i(j, y_−i, x_NE, x_E', ν) = π_i(k, y_−i, x_NE, x_E, ν) − π_i(k, y_−i, x_NE, x_E', ν) for j ≠ k and ∀i ∈ N. In other words, the variation in payoffs induced by X_E does not vary with players' choices. Rearranging the previous equation, one obtains that the payoff difference between actions j and k does not depend on x_E. Since this equality holds ∀i ∈ N, integrating over competitors' decisions implies that the difference in expected payoffs does not depend on x_E either. As a result, equilibrium choice probabilities do not depend on x_E given x_NE and ν.
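The implication of this condition can be verified with a short numerical check. Assuming logit best responses (an illustrative functional form, not imposed by the paper), a payoff shift induced by X_E that is common to all actions leaves choice probabilities unchanged:

```python
import numpy as np

def logit_probs(u):
    """Logit choice probabilities implied by expected payoffs u (the logit
    form is an illustrative assumption used only for this check)."""
    z = np.exp(u - u.max())
    return z / z.sum()

u = np.array([1.0, 0.3, -0.2])  # illustrative expected payoffs given (x_NE, nu)
shift = 0.7                     # payoff effect of x_E, identical for all actions

p_base = logit_probs(u)
p_shifted = logit_probs(u + shift)
print(np.allclose(p_base, p_shifted))  # x_E cancels out of payoff differences
```

Any shift that is constant across actions drops out of payoff differences, which is exactly why equilibrium choice probabilities exclude x_E.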
Variables X E satisfying (15) could be some observable outcomes related to players' previous decisions that are realized before the game is played and are taken as given in the game that is studied by the researcher. Such variables are predetermined outcomes. The key for (15) to be satisfied is that, even if those previous decisions do affect players' current payoffs, this effect must be the same regardless of the decisions made by the players. For instance, in games of market entry, predetermined outcomes could be previously acquired fixed assets that are in use no matter whether the firm decides to enter or not, and are not modifiable in the game of interest.
Predetermined outcomes should therefore be informative about common knowledge state variables available to the players, including information that is unobservable to the econometrician and therefore satisfy the relevance condition. Since, from equation (17), equilibrium choice probabilities do not depend on such predetermined outcomes, the redundancy condition is also satisfied if these variables do not affect equilibrium selection (given X NE and V).
Of course, the equilibrium selection being conditionally independent of the predetermined outcomes does put some restrictions on the type of variables that may be used as proxies. In particular, for predetermined outcomes to be valid, they should not be part of a multistage game in which previous actions could be informative about players' decisions and equilibrium selection beyond X_NE and V. For example, predetermined outcomes allowing players to signal their willingness to take specific future actions would not be applicable in the current setting. As suggested by Van Damme (1989) and Ben-Porath (1992), such signals could alter the equilibrium selection and would therefore not satisfy the desired exclusion restriction. To the extent that one can convincingly argue that the proposed predetermined outcomes are not part of a multistage forward-looking decision process, and are therefore not predictive of which equilibrium of the game is realized (once again, given X_NE and V), the conditions for valid proxies are satisfied, provided that (15) also holds.
Below is an example of predetermined outcomes that can be used as proxies for common knowledge unobserved heterogeneity in a game of market entry between two players. This example is a simplification of the model proposed by Gowrisankaran and Krainer (2011).

Example 4 (Potential locations in an entry game). Gowrisankaran and Krainer
(2011) study a game of automatic teller machine (ATM) location and pricing decisions. There are two types of firms, banks and nonbanks, deciding whether or not to install an ATM on their premises and what level of fees to charge their ATM users. Consumers choose where to withdraw money based on ATM locations and fees. Potential locations for ATMs are existing banks and retail establishments (mostly grocery and convenience stores). Gowrisankaran and Krainer's (2011) estimation method assumes that the same equilibrium is played in different markets with identical observables. In their setting, a market is a rural county. Since rural areas are typically less densely populated than cities, different subsets of firms installing an ATM on their premises may be sustainable in equilibrium. The single equilibrium in the data assumption maintains that if banks are more likely than nonbanks to install an ATM in a given market, it must also be the case in all other markets that are similar according to common knowledge state variables.
The proposed method can be used to test the single equilibrium in the data assumption in this setting. One important form of unobserved heterogeneity that is worth taking into account when testing this assumption is consumers' preferences for banking services across markets. In fact, banks being more likely to install ATMs in a given market could either be due: (a) to consumers frequently visiting banking establishments; or (b) to the equilibrium in which banks are more likely to install ATMs being realized in this market.
Consider a simple version of the ATM location decision game (similar to an entry game). Let i = 1 be banks and i = 2 be nonbanks. A larger number of categories of players could be used as long as one is willing to assume that the conditional choice probability function is the same across players within a given category. Let y_i = 1 if player i installs an ATM; y_i = 0 otherwise. Suppose that, given x_NE, there are two types of markets such that |V(x_NE)| = 2. In markets with ν_1, consumers are frequent users of banking services and, therefore, often visit banking establishments. In markets with ν_0, consumers use banking services less frequently. Since banks are more likely to install an ATM in markets with realization ν_1, p_1(x_NE, ν_1) > p_1(x_NE, ν_0). Notice that one could allow for up to 2^2 = 4 elements in V(x_NE) (provided that there is enough variation in the support X_E), but considering |V(x_NE)| = 2 in this setting shows that even binary common knowledge unobserved heterogeneity may capture relevant variation in the data.
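Binary heterogeneity of this kind is exactly what confounds tests attributing all correlation to multiple equilibria. The following simulation (all probability values are illustrative, not estimates from Gowrisankaran and Krainer (2011)) draws markets in which decisions are independent conditional on ν yet correlated unconditionally:

```python
import numpy as np

rng = np.random.default_rng(1)
M = 200_000  # simulated markets

# Illustrative installation probabilities with a single equilibrium at each nu.
p_bank = {0: 0.3, 1: 0.7}     # p_1(x_NE, nu_0), p_1(x_NE, nu_1)
p_nonbank = {0: 0.2, 1: 0.5}  # p_2(x_NE, nu_0), p_2(x_NE, nu_1)
gamma = 0.4                   # Pr(nu = nu_1 | x_NE)

nu = (rng.random(M) < gamma).astype(int)
y1 = (rng.random(M) < np.where(nu == 1, p_bank[1], p_bank[0])).astype(float)
y2 = (rng.random(M) < np.where(nu == 1, p_nonbank[1], p_nonbank[0])).astype(float)

# Conditional on nu, decisions are (close to) uncorrelated...
print(round(np.corrcoef(y1[nu == 1], y2[nu == 1])[0, 1], 3))
# ...but unconditionally correlated, the same pattern that multiple
# equilibria would generate.
print(round(np.corrcoef(y1, y2)[0, 1], 3))
```

This is case (a) versus case (b) above: without a proxy for ν, the unconditional correlation cannot be attributed to one source or the other.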
Since Gowrisankaran and Krainer (2011) consider a game of ATM installation taking potential ATM locations as given, one could use information about banks' and nonbanks' establishments within a market (e.g., X_E could include two variables corresponding to the number of each player's potential locations) as predetermined outcomes that are valid proxies for this common knowledge unobserved heterogeneity. More precisely, banks' and nonbanks' establishments are the only potential locations for the game of interest. These predetermined outcomes have been decided prior to the game of interest, and their effects on banks' and nonbanks' payoffs are the same regardless of whether they decide to install ATMs at these locations. Moreover, since these potential ATM locations are taken as given, they cannot be modified in the ATM locations game. In that sense, they satisfy (15). In order to justify the redundancy of these predetermined outcomes, it remains to argue that potential locations do not affect equilibrium selection beyond X_NE and V. In fact, it is unlikely that banks and nonbanks chose the location of their premises (in many cases, before ATM technology was available) in order to signal that they will install an ATM in a way that would affect equilibrium selection. Nonetheless, this proxy should reflect consumers' preferences for banking services: observing relatively more bank establishments in a market is potentially associated with consumers visiting banking establishments more frequently, which in itself should affect ATM location decisions. This correlation between the predetermined outcomes and preferences for banking services allows one to control for such common knowledge unobserved heterogeneity when testing the single equilibrium in the data assumption.
Predetermined outcomes as the ones described above also have the advantage of not requiring the researcher to know the source of unobserved heterogeneity one would like to control for. There is however one caveat: by definition, since predetermined outcomes are realized before the game of interest, they are more suitable to capture time-invariant unobserved heterogeneity. In many examples, including Example 4, time-invariant unobserved heterogeneity is indeed a relevant source of unobserved heterogeneity one would like to control for when testing the degenerate equilibrium selection assumption.

Identifying the Number of Components
The identification result in this paper holds for a given |V(x_NE)|. It is worth noting that |V(x_NE)| is actually identifiable. Let P(x_NE) be the (|X_E| − 1) × (J + 1)^N matrix with element (i, j) given by p(y^j | x_NE, x_E^i) − p(y^j | x_NE, x_E^0), for i = 1, ..., |X_E| − 1 and j = 1, ..., (J + 1)^N; this matrix of differences is the one displayed in (18). As stated in Lemma 1, for each x_NE ∈ X_NE, |V(x_NE)| is identified through the rank of the matrix P(x_NE).
Since the elements in matrix P(x_NE) are constructed from point identified probabilities, its column rank is at most (J + 1)^N − 1. Therefore, since the rank of a matrix is bounded by the minimum of its numbers of rows and columns, the finite mixture representation restricts |V(x_NE)| to be at most min{(J + 1)^N, |X_E|}, which corresponds to the cardinality of the support condition stated in Assumption 5. If players have symmetric choice probabilities, some of the (J + 1)^N columns are necessarily identical, so that the column rank is at most J + 1 − 1 = J. This is the reason why the bound stated in Example 3 is min{J + 1, |X_E|}.
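The rank characterization in Lemma 1 can be illustrated with simulated probabilities. In the sketch below (all probability values are illustrative), a two-component mixture observed at four proxy values produces a difference matrix of rank |V| − 1 = 1:

```python
import numpy as np

# Two players with binary actions: (J + 1)^N = 4 joint outcomes.
# Two components (|V| = 2) with decisions independent conditional on nu.
p1 = np.array([0.3, 0.7])  # Pr(y_1 = 1 | nu_l), l = 0, 1 (illustrative)
p2 = np.array([0.2, 0.6])  # Pr(y_2 = 1 | nu_l)

def joint(l):
    a, b = p1[l], p2[l]
    return np.array([(1 - a) * (1 - b), (1 - a) * b, a * (1 - b), a * b])

# Mixing weights vary with the proxy x_E (relevance), |X_E| = 4.
gammas = [0.2, 0.4, 0.6, 0.8]  # gamma(nu_1 | x_E), illustrative
P_xE = np.array([(1 - g) * joint(0) + g * joint(1) for g in gammas])

# Rows of the difference matrix: p(y | x_E^i) - p(y | x_E^0).
DP = P_xE[1:] - P_xE[0]
rank = np.linalg.matrix_rank(DP, tol=1e-10)
print(rank)  # 1: under Lemma 1's conditions, the identified |V| is rank + 1
```

With |V| = 1 every row of the difference matrix would be identically zero, which is the rank-0 case discussed next.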
An appealing by-product of Lemma 1 is that it can be used to test the hypothesis of the absence of payoff-relevant unobserved heterogeneity while remaining agnostic about the number of equilibria realized in the data. To the best of my knowledge, such a test is not available in the literature. In fact, from Lemma 1, |V(x_NE)| = 1 if and only if rank{P(x_NE)} = 0, i.e., all the elements of the matrix in (18) are equal to 0. In other words, testing that all the elements of this matrix are jointly equal to 0 amounts to testing the absence of payoff-relevant unobserved heterogeneity. The test proposed in the current paper is a test of the degenerate equilibrium selection assumption that allows for correlation between players' decisions due to common knowledge unobserved heterogeneity. In that sense, it requires |V(x_NE)| ≥ 2. Nonetheless, if one cannot reject that all the elements in (18) are jointly equal to 0, then one may interpret this finding as evidence against the presence of (discrete) payoff-relevant unobserved heterogeneity and can apply other tests available in the literature that assume away such unobserved heterogeneity (e.g., de Paula and Tang, 2012).

Constructing the Identified Set
The identification result presented in this paper is conditional on x NE , such that all functions and statistics presented below depend on x NE . In order to alleviate notation, x NE is omitted as an argument.
The main intuition behind the identification argument combines results from Henry et al. (2014) and Kasahara and Shimotsu (2014). Let N_ζ denote some subset of N such that |N_ζ| = N_ζ ≤ N. Given the identifying restrictions stated above, one can write the unknown probabilities p(y_ζ | ν), ∀y_ζ ∈ Y_{N_ζ}, ∀N_ζ ⊆ N, ∀ν ∈ V, and the mixing weights γ(ν | x_E), ∀x_E ∈ X_E, ∀ν ∈ V, as functions of a vector of parameters θ to be defined below. Then, one can construct the identified set I such that θ belongs to I if and only if the data can be rationalized by the finite mixture over V corresponding to θ and the degenerate equilibrium selection, i.e., |T*(ν)| = 1, ∀ν ∈ V. For a given x_NE, I ≠ ∅ is therefore a testable implication of the single equilibrium in the data assumption. There are two types of restrictions defining the identified set.
1. The distribution of players' decisions can be written as a finite mixture over V, i.e., the unknown ν-specific probabilities and the mixing weights are proper probability mass functions.
2. The distributions of players' decisions satisfy the conditions implied by a single equilibrium being realized in the data, i.e., players' decisions are independent after conditioning on all common knowledge information.
While the first type of restrictions are similar to the ones provided in Henry et al. (2014), the second type of restrictions are specific to testing the degenerate equilibrium selection assumption and can be stated using an identification result from Kasahara and Shimotsu (2014).
Some notation must be introduced before stating formal results. First, the matrix P is defined as in (18). Let P̄ be a (|V| − 1) × (|V| − 1) matrix obtained by deleting some rows and some columns of P. For subsets X̄_E ⊆ X_E and Ȳ_N ⊆ Y_N satisfying Assumptions 5(v) and 5(vi), the matrix P̄ is invertible. Let p_r(x_E) be the row of P corresponding to x_E after dropping the columns that are not in P̄. Similarly, let p_c(y) be the column of P corresponding to y after dropping the rows that are not in P̄. Similar matrices and vectors P̄_ζ, p_ζ,r(x_E), and p_ζ,c(y_ζ) can be defined for any subset N_ζ. Furthermore, denote Δγ(ν | x_E) ≡ γ(ν | x_E) − γ(ν | x_E^0). While Henry et al. (2014) study the case where y is a scalar, their results directly extend to p(y_ζ | ν), ∀y_ζ ∈ Y_{N_ζ}, ν ∈ V, for any subset N_ζ. Proposition 1 states that these probabilities and γ(ν | x_E), ∀x_E ∈ X_E, ν ∈ V, can be written as functions of |V| × (|V| − 1) parameters stored in the (|V| − 1) × 1 vector φ and the (|V| − 1) × (|V| − 1) matrix ϒ defined in (20) and (21). From Assumption 5(v), ϒ is invertible. The vector θ collects all the elements of φ and ϒ.
The expressions of the finite mixture probabilities in Proposition 1 are used to define the identified set I of all the values of (φ, ϒ) that rationalize the distribution of players' decisions via a finite mixture over V and a single equilibrium being realized in the data. The result is formally stated in Proposition 2. The following notation is used. Let N_ζn denote the subset of N containing exactly players i = 1, ..., n. Furthermore, let P_ζn(ν_l) be the (J + 1)^n × (J + 1) matrix of joint probabilities, conditional on ν_l, over the n-dimensional vector containing the decisions of players 1, ..., n, i.e., y_ζn ∈ {y_ζn^1, ..., y_ζn^{(J+1)^n}}, and player n + 1's decision, i.e., y_{n+1} ∈ {y_{n+1}^1, ..., y_{n+1}^{J+1}}. Using Proposition 1, one can write the (i, j)th element of P_ζn(ν_l) for l ∈ {0, ..., |V| − 1} as in (25).
The conditions defining I in Proposition 2 are interpreted as follows. Parts (i) and (ii) guarantee that the probabilities defining the finite mixture over the common knowledge unobserved heterogeneity properly belong to the unit interval. These inequalities are the same as in Henry et al. (2014, Sects. 3 and 4.2). More precisely, part (i) is equivalent to γ(ν_l | x_E) ≥ 0 for l = 1, ..., |V| − 1 and Σ_{l=1}^{|V|−1} γ(ν_l | x_E) < 1, ∀x_E ∈ X_E. Furthermore, part (ii) states that 0 ≤ p(y | ν) ≤ 1, ∀y ∈ Y_N and ∀ν ∈ V. Part (iii) follows from the degenerate equilibrium selection assumption. The fact that this condition can be characterized via the rank of the matrices P_ζn(ν_l) being equal to 1 follows from Kasahara and Shimotsu (2014, Prop. 4). It is a consequence of the conditional independence of players' decisions when a single equilibrium is realized in the data. Several remarks are worth making about the identified set.
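The rank-1 characterization in part (iii) is easy to visualize numerically. In the sketch below (all probability vectors are illustrative), conditional independence under a single equilibrium yields a rank-1 joint matrix, while mixing two equilibria at the same ν breaks the condition:

```python
import numpy as np

# Under a single equilibrium, players 1 and 2 decide independently given nu,
# so the joint matrix over (y_1, y_2) is an outer product of rank 1.
p1 = np.array([0.3, 0.7])  # illustrative marginals given nu
p2 = np.array([0.6, 0.4])
P_single = np.outer(p1, p2)
print(np.linalg.matrix_rank(P_single))  # 1

# If two distinct equilibria are played at the same nu, conditional
# independence fails and the rank-1 condition in part (iii) is violated.
P_mixed = 0.5 * np.outer([0.2, 0.8], [0.3, 0.7]) \
        + 0.5 * np.outer([0.7, 0.3], [0.8, 0.2])
print(np.linalg.matrix_rank(P_mixed))  # 2
```

Any candidate θ implying a joint matrix like `P_mixed` at some ν_l therefore falls outside the identified set.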
Remark 1 (Sharp identified set). The identified set I is sharp in the following sense. Given X NE , if the true data generating process corresponds to a single equilibrium being realized ∀ν ∈ V , then the value of θ corresponding to this true data generating process belongs to I and I cannot be empty. Conversely, each θ ∈ I corresponds to a finite mixture over V which rationalizes the joint distribution of players' decisions conditional on X NE through a single equilibrium at each realization of V. As a result, if I is not empty, the data can be rationalized by the degenerate equilibrium selection and the finite mixture unobserved heterogeneity assumptions.
Remark 2 (Equality restrictions). The conditions in part (iii) from Proposition 2 are equality restrictions and, therefore, drastically shrink the identified set. In fact, as it is stated in Corollary 1, they shrink the identified set to a singleton if |V | = 2 and the identified set is nonempty.
Remark 3 (Separately verifying finite mixture over unobserved heterogeneity). By the nature of its defining restrictions, one can check whether I being empty is due to a violation of the single equilibrium in the data assumption or of the finite mixture representation of the unobserved heterogeneity. Here, the identifying restrictions in Assumption 5 are key. They allow one to identify the p(y | ν)'s, regardless of the number of equilibria realized at each ν ∈ V, using conditions (i) and (ii) in Proposition 2. In the Monte Carlo simulations presented in Section 6, the finite mixture restrictions are separately tested in order to assess what fraction of test rejections is actually due to the finite mixture representation being rejected when it is true.
Remark 4 (Number of players and actions). A nice consequence of Proposition 2 is that increasing the number of players and/or the number of actions in the choice set does not affect the dimension of θ, which is entirely driven by |V|. In that sense, increasing N and/or J only adds more conditions that further restrict I.
Remark 5 (Separating the finite mixtures). Variation in X E restricts the set of θ satisfying the conditions implied by the mixture over V. However, variation in X E does not affect the conditions related to the single equilibrium in the data assumption. In fact, X E is excluded from the equilibrium choice probabilities and the equilibrium selection mechanism. In that sense, the proxy variables are key to separate the two layers of finite mixtures: the one over the multiple equilibria and the one over the unobserved heterogeneity.

The Special Case of a Mixture over Two Components
It is easy to represent graphically the identified set for |V | = 2 (for any N and J).
There are two parameters in θ, i.e., the scalars φ and ϒ, which are respectively given by the first elements of (20) and (21). For |V| = 2, Proposition 1 expresses, ∀y ∈ Y_N, the component probabilities p(y | ν_0) and p(y | ν_1) in terms of p(y | x_E^0), the differences p(y | x_E^1) − p(y | x_E^0), and the reparameterized couple (−φ/ϒ, (1 − φ)/ϒ); in particular, these expressions do not depend on the choice of ỹ^1. Consider the case where p(y | x_E^1) − p(y | x_E^0) > 0. Part (i) in Proposition 2 simplifies to 0 < γ(x_E) < 1, ∀x_E ∈ X_E, which, for a given x_E, yields inequalities (31) and (32). Moreover, part (ii) in Proposition 2 simplifies to 0 ≤ p(y | ν_l) ≤ 1, ∀y ∈ Y_N and l = 0, 1, which, for a given y ∈ Y_N, yields inequalities (33) and (34). The upper bound on −φ/ϒ in (33) is satisfied whenever (31) holds; similarly, the lower bound on (1 − φ)/ϒ in (34) is satisfied whenever (32) holds. Inequalities (31)-(34) can therefore be summarized as bounds on the couple (−φ/ϒ, (1 − φ)/ϒ). A similar argument applies to the case where p(y | x_E^1) − p(y | x_E^0) < 0. Since these inequalities must hold for all y ∈ Y_N and all x_E ∈ X_E, the restrictions on θ coming from Proposition 2(i) and (ii) boil down to the intersection of these bounds. When |V| = 2, part (iii) from Proposition 2 is satisfied by at most one possible couple (−φ/ϒ, (1 − φ)/ϒ). To see this, notice that, for n = 1, P_ζn(ν_l) is simply the matrix of the joint distribution of the decisions of players 1 and 2. A necessary condition for rank{P_ζn(ν_l)} = 1 is that all minors of order 2 computed from P_ζn(ν_l) must be equal to 0. As a result, using y_i^1, y_i^2 to denote two arbitrary actions of player i = 1, 2, the following equation must hold for l = 0, 1: p(y_1^1, y_2^1 | ν_l) p(y_1^2, y_2^2 | ν_l) − p(y_1^1, y_2^2 | ν_l) p(y_1^2, y_2^1 | ν_l) = 0. The discussion above is summarized in Figure 2, in which the shaded area corresponds to the θ's that satisfy conditions (i) and (ii) from Proposition 2. As Figure 2 makes obvious, part (iii) of Proposition 2 then shrinks the identified set to a singleton. This result is formally stated in Corollary 1.
Proof. The proof directly follows from noting that there is at most one couple (−φ/ϒ, (1 − φ)/ϒ) that satisfies condition (iii) in Proposition 2. If it exists, this couple defines a system of two equations in two unknowns, which leads to at most one θ ∈ I.
Corollary 1 implies that if the data can be rationalized by a single equilibrium and a finite mixture over |V| = 2 values of ν, the corresponding θ is point identified (for any N and J). One way to interpret Corollary 1 is that, if one is willing to assume that the degenerate equilibrium selection assumption holds, then players' decisions are independent given V and the finite mixture over V is point identified from variation in the proxy for V when |V| = 2.
Remark 6 (Point identification and relation to the literature). The result in Corollary 1 leverages both the variation in the proxy for V and the conditional independence of Y 1 , ..., Y N given V. In that sense, the fact that it delivers point identification instead of set identification as in Henry et al. (2014) is not too surprising. Furthermore, applying the results from Hall and Zhou (2003) to the current setting would imply that point identification of the mixture over V should require N ≥ 3. However, Hall and Zhou (2003) only leverage the conditional independence of Y 1 , ..., Y N given V. The mixture being point-identified even if N = 2 in Corollary 1 is due to the identifying power of the proxy for V.
Remark 7 (Point identification with |V| > 2). The point identification result in Corollary 1 does not directly extend to |V| > 2. The key feature of the |V| = 2 case is that the conditions in part (iii) of Proposition 2 define equality constraints on the reparameterization (−φ/ϒ, (1 − φ)/ϒ), which can be used to construct a system of two equations that are linear in the two unknowns φ and ϒ. In the case |V| > 2, this reparameterization becomes [e_l − φ]ϒ^{−1} for l ∈ {0, 1, ..., |V| − 1}. Even in cases where the conditions in part (iii) from Proposition 2 would be satisfied by a unique value of the reparameterized parameters, the resulting system of |V| × (|V| − 1) equations is not linear in the |V| × (|V| − 1) elements of θ. There is therefore no guarantee that a unique value of θ satisfies the conditions defining I.

Testable Implications
The null hypothesis H_0: |T*(x, ν)| = 1, ∀ν ∈ V(x), in (10) cannot be directly tested. In fact, it depends on the ν ∈ V(x), which are unobservable. Nonetheless, the identification results derived above show that I being nonempty is a necessary condition for the single equilibrium in the data assumption in the presence of discrete common knowledge unobserved heterogeneity. More formally, under Assumptions 1 to 5, the null and alternative hypotheses in (10) lead to the following testable implications: H̃_0: I ≠ ∅; H̃_1: I = ∅.
These testable implications are the ones used to construct the statistical test described in this section. Of course, one drawback of considering testable implications is that, for some data generating processes, H̃_0 could be true even if H_0 is false. This limitation of the test is due to the fact that one does not observe the true p(y | ν)'s, but only knows that they belong to a set. Some of the probabilities in this set may be rationalized through a single equilibrium even if the data are generated by multiple ones.
In most cases, however, this possible drawback is not a concern. Proposition 3 states that if the number of players and/or the number of actions in the choice set is large enough compared to |V|, testing H̃_0 is both necessary and sufficient for H_0, except for a set of data generating processes that has Lebesgue measure zero within the space of choice probabilities p(y | ν), ∀y ∈ Y_N and ∀ν ∈ V. In that sense, the testable implications are said to be generically sufficient for the single equilibrium in the data assumption. While the condition of Proposition 3 (along with Assumptions 1-5) guarantees that H̃_0 is generically sufficient for H_0, it may not be necessary: the proof does not rule out that generic sufficiency could hold even if this condition is not satisfied. The condition implies that H̃_0 is not generically sufficient for H_0 when N = 2 players choose between J + 1 = 2 possible actions, even if |V| is as small as 2. However, marginally increasing the number of players or the number of actions already allows for a larger number of points in the support of V while ensuring that H̃_0 is generically sufficient for H_0. For games in which three players choose between two actions, the condition for generic sufficiency is satisfied with |V| ≤ 7. With two players and three actions, up to |V| ≤ 9 may be allowed for. In both cases, the restriction |V| ≤ min{(J + 1)^N, |X_E|} is still satisfied provided that there are sufficiently many points in X_E.
It is also worth emphasizing that, even in cases where the condition of Proposition 3 is not satisfied, a statistical test based on H̃_0 remains informative and is still of interest to applied researchers hoping to leverage multiplicity of equilibria to identify payoff functions. In fact, if one rejects H̃_0, then one must reject H_0. As a result, rejecting H̃_0 provides evidence of multiple equilibria being realized in the data that is robust to discrete common knowledge unobserved heterogeneity. In contrast, if one finds evidence of correlation between players' decisions using currently existing tests in the literature, it is not obvious whether this correlation is due to multiple equilibria or to common knowledge unobserved heterogeneity, since the latter is assumed away.
Checking the testable implications of the single equilibrium in the data assumption boils down to a specification test in partial identification. 19 One can use the intersection bounds approach from Chernozhukov et al. (2013) to perform this test. 20 The details about the implementation of their procedure in the current setting are given in Appendix B.

19 The idea of checking testable implications of a given assumption has also been suggested in other contexts. See, for instance, Kitagawa (2015) for a test of instrument validity; Mourifié and Wan (2017) for local average treatment effect assumptions; Ghanem (2017) for identifying assumptions in nonseparable panel data models; Hsu, Liu, and Shi (2019) for generalized regression monotonicity; or Mourifié, Henry, and Méango (2020) for the Roy model. In many cases, the proposed test is implemented via a specification test, similarly as in the current setting.

20 A previously circulated version of the current paper used the inference method of Shi and Shum (2015), which worked well and generated simulation results qualitatively similar to the ones reported below. However, Shi and Shum's (2015) approach requires at least one equality condition defining the identified set. While this requirement is satisfied when testing (40) (i.e., testing necessary conditions for both the single equilibrium in the data assumption and the finite mixture representation of the unobserved heterogeneity), it is not met when focusing only on the conditions associated with the finite mixture representation (see Section 6.4).

Intersection Bounds Formulation

For a given x_NE, the conditions defining I can be rewritten as inequalities that must be greater than or equal to zero. These inequalities depend on the joint probabilities of players' decisions ∀y ∈ Y_N and ∀x_E ∈ X_E. Let p be the vector collecting all such probabilities, 21 and let p_0 be the true value of p in the population. For a given θ, collect the inequalities defining I in the L × 1 vector g(p_0, θ) ≡ (g_1(p_0, θ), ..., g_L(p_0, θ))′ and let L ≡ {1, 2, ..., L}. From this notation, one can write the identified set as I = {θ : g_l(p_0, θ) ≥ 0, ∀l ∈ L}. Testing θ ∈ I is thus equivalent to testing inf_{l∈L} g_l(p_0, θ) ≥ 0 against inf_{l∈L} g_l(p_0, θ) < 0. Chernozhukov et al. (2013) propose an estimate of the end point of a 1 − α one-sided confidence interval for inf_{l∈L} g_l(p_0, θ), denoted ĝ(θ, α) and given by ĝ(θ, α) = inf_{l∈L} [g_l(p̂, θ) + ĉ(θ, 1 − α)σ̂_l(θ)], where ĉ(θ, 1 − α) is an estimated critical value and σ̂_l(θ) is the estimated standard error of g_l(p̂, θ). Details about ĉ(θ, 1 − α) and σ̂_l(θ) are in Appendix B. The vector p̂ is the estimate of the population probabilities p_0 and satisfies √M(p̂ − p_0) →_d N(0, Σ), where Σ is a block-diagonal matrix with x_E-specific blocks Σ(x_E)/Pr(X_E = x_E). Chernozhukov et al. (2013) show that lim sup_{M→∞} Pr{inf_{l∈L} g_l(p_0, θ) ≥ ĝ(θ, α)} ≤ α.

It follows that one rejects the hypothesis that θ ∈ I at the 1 − α confidence level if ĝ(θ, α) < 0. Testing the emptiness of I can be done by checking whether the confidence set, i.e., the collection of θ's that belong to I with confidence level at least 1 − α, is empty. 22 Let the confidence set be defined as CS_M(α) ≡ {θ ∈ Θ : ĝ(θ, α) ≥ 0}.

21 In practice, one should drop the probabilities associated with a given realization y since the probabilities over all y ∈ Y_N are linearly dependent. This transformation is not made explicit in the text to alleviate notation.

22 Bugni, Canay, and Shi (2015) refer to such a specification test as the by-product test. They propose alternative approaches that typically have relatively better power. However, their tests are designed for identified sets defined according to moment inequalities and may not directly apply to the current setting.
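The resulting decision rule can be sketched in a few lines. The critical value below is a fixed illustrative constant standing in for ĉ(θ, 1 − α), which Chernozhukov et al. (2013) estimate adaptively, and the moment values and standard errors are hypothetical:

```python
import numpy as np

def g_hat(g_vals, se_vals, c=2.0):
    """End point of the one-sided confidence interval for inf_l g_l(p0, theta).

    Each estimated inequality is shifted by a critical value times its
    standard error before taking the minimum; c = 2.0 is a fixed
    illustrative constant standing in for c-hat(theta, 1 - alpha)."""
    return np.min(np.asarray(g_vals) + c * np.asarray(se_vals))

se = np.full(3, 0.01)  # hypothetical standard errors of g_l(p-hat, theta)

# theta consistent with the data: all inequalities comfortably positive,
# so theta is kept in the confidence set CS_M(alpha).
print(g_hat([0.10, 0.08, 0.12], se) >= 0)
# theta violating one inequality by much more than sampling error:
# g-hat < 0, so theta is rejected; if every theta on the grid is rejected,
# CS_M(alpha) is empty and the null is rejected.
print(g_hat([0.10, -0.06, 0.12], se) < 0)
```

Scanning this rule over a grid of θ's and checking whether any grid point survives is exactly the emptiness check described next.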
One can then consider the following non-randomized decision rule: ξ_M ≡ 1{CS_M(α) = ∅}. Let E_θ[·] denote the expectation under the data generating process corresponding to θ. In particular, for θ ∈ I, E_θ[ξ_M] is the probability of rejecting H̃_0 : I ≠ ∅ when it is true. Proposition 4 states that the statistical test of this null hypothesis based on the decision rule ξ_M is asymptotically level α.
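The decision rule can be sketched as a scan of the gridded parameter space, stopping as soon as one θ survives. This is illustrative only: `g_hat_fn` stands in for the full end-point computation of the previous section, and the toy functions below are placeholders.

```python
import numpy as np

def xi_M(theta_grid, g_hat_fn, alpha=0.1):
    """Non-randomized decision rule: reject the null that I is nonempty iff
    CS_M(alpha) is empty, i.e. no theta on the grid has g_hat(theta, alpha) >= 0."""
    for theta in theta_grid:
        if g_hat_fn(theta, alpha) >= 0:
            return 0  # confidence set nonempty: fail to reject
    return 1          # confidence set empty: reject

grid = np.linspace(0.0, 1.0, 201)
# Toy end-point functions: the first is nonnegative near 0.5, the second never is.
keep = xi_M(grid, lambda t, a: 0.2 - abs(t - 0.5))
drop = xi_M(grid, lambda t, a: -0.1 - abs(t - 0.5))
```

The early exit matters in practice: as noted in the simulation section, one only needs to span Θ until CS_M(α) is found to be nonempty.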
Proposition 4 (Asymptotic level of the test). Suppose that:

(i) G(p, θ) ≡ ∂g(p, θ)/∂p is uniformly Lipschitz;
(ii) the Euclidean norm of each row of G(p, θ) is bounded away from 0 uniformly in x_NE and y;
(iii) the eigenvalues of Ω are bounded from above and away from 0; and
(iv) Ω̂ is a consistent estimator of Ω.

Then, for any θ ∈ I, lim sup_{M→∞} E_θ[ξ_M] ≤ α.

Computational Aspects
The conditions defining I are nonlinear in the elements of θ and, therefore, one would typically perform a grid search over the parameter space Θ in order to check whether CS_M(α) is empty. Of course, since the dimension of the parameter space increases with |V| (there are |V| × (|V| − 1) elements in θ), the computational burden of the test increases with the number of components in the mixture over the unobserved heterogeneity. The dimension of the parameter space one must deal with in the current test is inherited from the partial identification of the finite mixture over the unobserved heterogeneity and is therefore exactly the same as in Henry et al. (2014).
There is, however, a nice feature of I that can be leveraged to considerably reduce the computational burden of the test: the equality constraints in part (iii) of Proposition 2. These equality constraints considerably shrink the identified set and can be used to choose starting values from the gridded parameter space. To see this, notice that the condition rank(P_ζn(ν_l)) = 1 implies that all minors of order 2 constructed from P_ζn(ν_l) must be equal to 0. Using (25), part (iii) of Proposition 2 defines quadratic expressions in reparameterizations of θ given by [e_l − φ]′ ϒ^{−1} for l ∈ {0, 1, ..., |V| − 1} that must simultaneously be equal to 0. If |V| = 2, these equality restrictions define the roots of parabolas, which can easily be computed in closed form. One can then use the values from the grid that are closest to the roots as starting values in the grid search. If |V| = 3, the equality restrictions define conic sections, and the (reparameterized) values of θ that belong to I lie at the intersections of these conic sections. If |V| > 3, these values lie at the intersections of quadric surfaces. While implementing algorithms to find the intersections of conic sections or quadric surfaces may be challenging (see, e.g., Chan, 2006), the ability to initiate the grid search at values that are close to these conic sections or quadric surfaces may considerably reduce the computational burden.
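For the |V| = 2 case, the root-based choice of starting values can be sketched as follows. The quadratic coefficients below are hypothetical placeholders for the minor conditions; only the mechanics (closed-form roots, snapped to the nearest grid point) reflect the procedure described above.

```python
import numpy as np

def starting_values(quad_coeffs, grid):
    """For |V| = 2 the order-2 minor conditions are parabolas in a scalar
    reparameterization of theta; their real roots are available in closed
    form.  Snap each root to the closest grid point to seed the grid search.
    quad_coeffs : list of (a, b, c) triples for a*t**2 + b*t + c = 0."""
    grid = np.asarray(grid)
    starts = set()
    for a, b, c in quad_coeffs:
        for root in np.roots([a, b, c]):
            if np.isreal(root):
                idx = int(np.argmin(np.abs(grid - root.real)))
                starts.add(float(grid[idx]))
    return sorted(starts)

grid = np.arange(0.005, 1.0, 0.005)                  # the grid used in the simulations
seeds = starting_values([(1.0, -1.0, 0.21)], grid)   # parabola with roots 0.3 and 0.7
```

If the seeded points are not in CS_M(α), one falls back to scanning the full grid, as in the Monte Carlo design.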
It is also worth describing how many inequalities must typically be checked to see whether θ belongs to CS_M(α). 23 Parts (i) and (ii) in Proposition 2 are written in terms of convex hulls and ranges of functions, as in Henry et al. (2014). Alternatively, part (i) can be written as min_{x_E∈X_E} γ_ν_l(x_E) ≥ 0 for l = 1, ..., |V| − 1 together with min_{x_E∈X_E} [1 − Σ_{l=1}^{|V|−1} γ_ν_l(x_E)] ≥ 0, leading to |V| conditions to check. Part (ii) can also be written as min_{y∈Y^N} p(y|ν) ≥ 0 and Σ_{y∈Y^N} p(y|ν) = 1 for all ν ∈ V. Notice, however, that the latter equality is always satisfied given the expression for p(y|ν) in Proposition 1, since Σ_{y∈Y^N} p(y|x_E) = 1. Part (ii) therefore leads to |V| inequalities. Finally, the equalities in part (iii) of Proposition 2, more precisely the conditions that all minors of order 2 must be equal to 0, lead to two inequalities: both the minimum value of all minors of order 2 and the minimum value of the negatives of all minors of order 2 must be greater than or equal to 0. Overall, there are therefore 2 × |V| + 2 inequalities to check to see whether θ belongs to CS_M(α).
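The folding of the minor equalities into two inequalities can be illustrated numerically. The matrices below are toy examples, not the model's P_ζn(ν_l); the point is only that the two-sided minimum is nonnegative exactly when every order-2 minor vanishes.

```python
import numpy as np
from itertools import combinations

def order2_minors(P):
    """All order-2 minors of P: determinants of every 2 x 2 submatrix."""
    n, m = P.shape
    return np.array([np.linalg.det(P[np.ix_(rows, cols)])
                     for rows in combinations(range(n), 2)
                     for cols in combinations(range(m), 2)])

def min_rank_one_slack(P):
    """The rank-one equalities folded into two inequalities: the minimum over
    the minors and the minimum over the negated minors must both be >= 0,
    which holds iff every order-2 minor equals zero."""
    m = order2_minors(P)
    return float(min(m.min(), (-m).min()))

rank1 = np.outer([0.3, 0.7], [0.4, 0.6])   # rank-1 table: all minors vanish
slack_ok = min_rank_one_slack(rank1)        # ~ 0 up to floating-point error
slack_bad = min_rank_one_slack(np.eye(2))   # strictly negative: rank 2
```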

MONTE CARLO SIMULATIONS
This section provides simulation evidence on the statistical size and power of the test of the single equilibrium in the data assumption based on the testable implications. More precisely, the size of the test is the probability of rejecting H̃_0 (i.e., ξ_M = 1) when H_0 is true (i.e., there is a single equilibrium in the data for all values of ν). The power of the test is the probability of rejecting H̃_0 (i.e., ξ_M = 1) when H_0 is false (i.e., there are multiple equilibria in the data for some value(s) of ν). The robustness of the test under some levels of misspecification is also studied.

Data Generating Processes
The data generating processes are based on the simple game of market entry introduced in Example 1. The N = 2 version of the model as described above, as well as an N = 3 extension, are considered. Different data generating processes are created by varying the equilibrium selection mechanisms and the realizations of V. The analysis of the size and the power of the test is based on a total of 13 different cases, which vary according to the number of players, the number of equilibria realized in the data, which equilibria are realized, and the equilibrium selection mechanisms. Six other cases are included to study whether the test is robust to some potential misspecifications. In particular, the simulations investigate the effect of misspecifying the number of components in the finite mixture over V and of incorrectly discretizing the support of X_E. The 19 different data generating processes are summarized in Table 1. Firms' payoffs follow the entry-game specification of Example 1 for N = 2, extended with a third competitor for N = 3. As before, let p_i(1|ν) denote the probability that y_i = 1, given x_NE (dropped to lighten notation) and ν; the pure-strategy BNE indexed by τ is then characterized by the corresponding system of best-response probabilities for N = 2 and N = 3. For most data generating processes, x_NE ≡ [x_NE,1, x_NE,2, x_NE,3] = [1, 1, 1] and ν_l = [ν_l1, ν_l2, ν_l3] for l = 0, 1, with ν_0 = [1, 0.5, 0.5]′ and ν_1 = [1.25, 1, 1.25]′. When studying the robustness of the test, some cases add one more realization ν_2 = [0.5, 0.25]′ to the data generating process described above with N = 2. Table 2 summarizes the multiple equilibria of the model. For N = 2, the model admits three solutions for each ν. When N = 3, there are five solutions for each realization ν.
When studying the size and the power of the test, there are four different realizations of x_E. For each of these realizations, the weights associated with ν_1 are listed in Table 3, where γ(x_E) ≡ γ_ν_1(x_E). Since γ(x_E^0) = 0.05 and γ(x_E^1) = 0.95, it follows that φ_0 = 0.05 and ϒ_0 = 0.90. These weights will be modified for some of the robustness analysis provided below.

[Table 2 note: p(ν, τ) is the vector containing each player's probability of choosing y_i = 1 given x_NE (dropped to lighten notation) and ν in equilibrium τ. Table 3: Mixture weights for unobserved heterogeneity.]

For a given Monte Carlo sample, the joint conditional choice probabilities in p_0 are estimated using a simple frequency count estimator. Since estimated choice probabilities equal to 0 may lead to ill-defined test statistics, 0's are replaced with 10^{−6} and the vector of choice probabilities given x_E is normalized to sum to 1. Moreover, since the sets X̃_E and Ỹ^N are arbitrary, let x̃_E^0 = x_E^0 and x̃_E^1 = x_E^1, and let ỹ_1 be the value of y ∈ Y^N that maximizes p̂(y|x̃_E^1) − p̂(y|x̃_E^0). This is done to avoid p̂(ỹ_1|x̃_E^1) − p̂(ỹ_1|x̃_E^0) = 0, which would cause the procedure to fail. In order to compute ξ_M, one must span the parameter space Θ, at least until CS_M(α) is found to be nonempty. The simulations below use the grid {0.005, 0.01, ..., 0.99, 0.995} for both φ and ϒ. As described in Section 5.3, the equality restrictions from Proposition 2 part (iii) are used to choose starting values on the grid. If these starting values are not in CS_M(α), the method considers each possible value of θ in this discretized parameter space (subject to φ + ϒ < 1) and computes ĝ(θ, α) as in (42), where the details about the construction of the critical values and the standard errors are in Appendix B. When computing the critical values, R = 500 draws are used. The results report different sample sizes for each value of x_E, denoted by M(x_E) ∈ {50, 250, 500, 1,000}.
In most cases, |X_E| = 4 (such that M ∈ {200, 1,000, 2,000, 4,000}). The only exceptions are cases R3-R6, in which the incorrectly discretized support of X_E is such that |X_E| = 3 (and M ∈ {150, 750, 1,500, 3,000}). The rejection probabilities of the test, for α = 0.1 and α = 0.05, are computed using 250 Monte Carlo samples.
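The frequency-count estimation step with the zero-replacement rule can be sketched as follows; the sample and outcome labels are hypothetical, but the 10^{−6} replacement and renormalization follow the Monte Carlo design above.

```python
import numpy as np

def ccp_frequency(sample, outcome_set, eps=1e-6):
    """Frequency-count estimate of the joint conditional choice probabilities
    for one value of x_E.  Estimated zeros are replaced by eps and the vector
    is renormalized to sum to one, as in the Monte Carlo design."""
    counts = np.array([sample.count(o) for o in outcome_set], dtype=float)
    p = counts / counts.sum()
    p = np.where(p == 0.0, eps, p)  # avoid ill-defined test statistics
    return p / p.sum()              # renormalize to sum to one

# Hypothetical N = 2 sample of joint outcomes (y_1 y_2).
p_hat = ccp_frequency(["00", "01", "01", "11"], ["00", "01", "10", "11"])
```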

Size of the Test
Cases S1-S5 are such that the data are generated from a single equilibrium at ν_0 and ν_1. For these cases, I ≠ ∅. The probabilities of rejecting H̃_0 are reported in the first part of Table 4. A few comments are worth making. First, cases S1-S4 confirm that the test defined by ξ_M in (46) is asymptotically level α, as expected from Proposition 4. However, the test is somewhat conservative. This observation is in fact a common feature of specification tests in partial identification that are based on checking whether the confidence set is empty. In the context of identified sets defined by moment inequalities, Andrews and Guggenberger (2009) and Andrews and Soares (2010) have already pointed out that such tests may have asymptotic size strictly smaller than α. Second, the main difference between case S5 and the previous ones is that the former is a game between N = 3 players instead of 2. Since there are 2^N − 1 probabilities to be estimated for each x_E, the case with N = 3 requires greater sample sizes for the asymptotic properties to apply. Notice that the rejection probabilities when M(x_E) = 50 are much smaller than the ones obtained for larger sample sizes. However, this observation seems to be due to several estimated probabilities in p̂ being zero (and therefore set to 10^{−6} before normalizing the sum to 1) in small samples. The results for M(x_E) = 50 should therefore be interpreted with caution. The test does over-reject in case S5 even at M(x_E) = 1,000, but the probability of rejecting typically decreases as the sample size increases.
Testing whether I is empty amounts to checking the testable implications associated with both the single equilibrium in the data assumption and the finite mixture representation of the unobserved heterogeneity. One may therefore wonder whether rejecting H̃_0 is due to rejecting the single equilibrium in the data assumption or the finite and discrete unobserved heterogeneity assumptions. An appealing feature of the proposed test is that it allows the researcher to separately check whether the restrictions due to the finite mixture representation are satisfied even when H̃_0 is rejected.
The second half of Table 4 reports the probability of rejecting the finite mixture assumptions, which are satisfied in all data generating processes. These probabilities are computed over the same Monte Carlo samples as when checking all restrictions. The only difference is that ĝ(θ, α) is now constructed using only the restrictions associated with the finite mixture over V. The results confirm that, except in a few cases, a non-negligible fraction of the rejections of H̃_0 when H_0 is true is due to apparent violations of the testable implications of the single equilibrium assumption.

Power of the Test
Cases P1-P8 are based on the same data generating processes as cases S1-S5, but use different equilibrium selection mechanisms. All cases except P7 mix two equilibria for both ν_0 and ν_1. Case P7 is such that there are two equilibria realized in the data for ν_0, but a single one for ν_1. All eight cases violate the single equilibrium in the data assumption and are such that H_0 is false. For all cases except P6, H̃_0 is also false. In fact, for games between two players choosing one of two actions with two components in the mixture over V, Proposition 3 implies that the set of data generating processes for which H̃_0 is true even though H_0 is false does not have Lebesgue measure zero. Case P6 is included as an example of such a data generating process.
While the test has power approaching 1 in large samples for most data generating processes reported in the table, the power is actually quite low in cases P5 and P6 even for large sample sizes. The power issue associated with P6 is as expected: the data can be rationalized by a single equilibrium (H̃_0 is true) even though the data are generated from multiple equilibria (H_0 is false). In this case, one expects the test to reject H̃_0 with probability at most α asymptotically since the data are still rationalizable by a single equilibrium.
The rejection probabilities associated with case P5 suggest that the test may have much better power against some alternative equilibrium selection mechanisms than others. Further investigation of this data generating process suggests that the low power associated with P5 is due to the fact that, even though I = ∅, the data are "almost" rationalizable by a single equilibrium not too far from the true equilibrium choice probabilities. The intuition can be understood via Figure 2. The singleton satisfying condition (iii) in Proposition 2 falls outside of, but very close to, the shaded area defined by conditions (i) and (ii). It is therefore not surprising to find that the test has low power in this case.
As in the analysis of the size of the test, the second panel of Table 5 is used to assess whether the test successfully rejects H̃_0 based on a violation of the single equilibrium in the data restrictions or on a failure to rationalize the data via a finite mixture over the unobserved heterogeneity. The finite mixture restrictions hold in the data generating processes, so these restrictions should seldom be rejected in large samples. The results show that the test is successful at separating the two types of conditions. In large samples, the test almost always rejects the testable implications of the single equilibrium in the data assumption, but fails to reject the testable implications of the finite mixture representation.

Robustness
Finally, because the proposed test is based on a specification test, it is worth exploring the approach's sensitivity to some potential misspecifications. In particular, the simulations reported below provide evidence that I is not de facto empty as soon as some of the assumptions maintained about the data generating process fail to hold exactly. The following experiments explore misspecification with respect to the support of the unobserved heterogeneity and the variable satisfying the exclusion restriction. So far, the number of components in the finite mixture over the unobserved heterogeneity, i.e., |V|, has been assumed to be known. 24 Cases R1 and R2 investigate the effect of incorrectly assuming |V| = 2 when |V| = 3 in the data generating process (with mixture weights in Table 6). While in R1 the data are generated from a single equilibrium for each ν, there are two equilibria for each ν in R2.
The rejection probabilities computed via Monte Carlo simulations for cases R1 and R2 are reported in the corresponding rows of Table 7. This type of misspecification does not alter the properties of the test. In R1, even though |V| = 3 in the data generating process, I ≠ ∅, which indicates that the data are still rationalizable by a single equilibrium and a finite mixture with two components. In that case, the misspecified test still achieves asymptotic size control. In R2, where there are multiple equilibria in the data, the data cannot be rationalized by a single equilibrium and a finite mixture over two components of unobserved heterogeneity, i.e., I = ∅. Once again, the power of the misspecified test tends to 1 as the sample size increases.
Interestingly, for both R1 and R2, one rarely rejects the testable restrictions associated with |V| = 2 even though the true mixture is such that |V| = 3, at least in large samples. In other words, the data are still rationalizable by a finite mixture over two components even when there are three components in the generating process. However, it is easy to see that the current test would not be robust to incorrectly assuming |V| = 2 when in fact |V| = 1. Asymptotically, p(y|x_E) would not vary with x_E, which implies that Q(x_E), L_0(y), and L_1(y) are not defined and I = ∅ regardless of the number of equilibria realized in the data.
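The rank-based recovery of the number of components invoked in Lemma 1 can be illustrated as follows. This is a sketch under the assumption that a consistent estimate of the relevant matrix is available; the matrix below is a toy example, not the model's P(x_NE).

```python
import numpy as np

def n_components(P, tol=1e-8):
    """Number of mixture components recovered as rank(P(x_NE)) + 1,
    in the spirit of Lemma 1 (numerical rank with an explicit tolerance)."""
    return int(np.linalg.matrix_rank(P, tol=tol)) + 1

# Hypothetical two-component example: every row of P is the same contrast
# vector scaled by a mixture-weight difference, so rank(P) = 1 and |V| = 2.
contrast = np.array([0.10, -0.05, -0.03, -0.02])
P = np.vstack([0.9 * contrast, 0.4 * contrast, -0.2 * contrast])
```

In finite samples the estimated matrix will not have exact reduced rank, which is why a formal rank test (rather than this plug-in rank) would be needed for pre-testing.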
Cases R3-R6 evaluate the effect of incorrectly discretizing X_E. 25 This experiment is meant to address the concern that some variables used as a proxy for the unobserved heterogeneity may be continuous in nature and discretized by the applied researcher. In order to assess the effect of an incorrect discretization of the support of such proxy variables in a simple way, suppose that some values of x_E are pooled together when estimating p_0. Cases R3 and R5 pool the true x_E^2 and x_E^3; R4 and R6 pool the true x_E^0 and x_E^1. In all cases, the test is performed after having estimated the joint distributions conditional on the remaining three values. The data generating processes for R3 and R4 are exactly the same as in S1 (i.e., both H_0 and H̃_0 are true). For R5 and R6, the data are generated as in P1 (i.e., both H_0 and H̃_0 are false). The rejection probabilities are also reported in Table 7. While incorrectly pooling x_E^2 and x_E^3 does not much affect the size and the power of the test, a discretization that fails to separate x_E^0 and x_E^1 can lead to important changes in the rejection probabilities.

24 As mentioned in Lemma 1, this number is point-identified from the rank of a matrix. Although one could perform a rank test to check whether the maintained number of components is rejected by the sample, the current procedure is not augmented with such a pre-testing approach in order to avoid issues related to testing multiple null hypotheses.

25 While properly discretizing X_NE is also an important decision, such potential misspecification is not included in the simulations. Pooling together values of x_NE associated with significantly different equilibria would likely be problematic not only for a test of the single equilibrium in the data assumption, but also for the estimation of the primitives of the model.
In fact, the rejection probabilities for R3 and R5 are very similar to those of their properly discretized counterparts (S1 and P1, respectively). However, this similarity does not hold for R4 and R6. This important distinction across the alternative discretizations may be due to the fact that the difference between the γ(x_E)'s is much larger for x_E^0 vs. x_E^1 than for x_E^2 vs. x_E^3. Finally, in cases R1, R4, and R6, the procedure fails in some rare instances (at most three out of 250 Monte Carlo samples, almost exclusively when M(x_E) = 50). Such failures were not encountered in cases S1-S5 or P1-P8.

CONCLUDING REMARKS
To sum up, the test of the single equilibrium in the data assumption presented in this paper addresses two important issues associated with procedures previously proposed in the literature. First, it allows for common knowledge payoff-relevant unobservables, an additional plausible source of correlation between players' decisions beyond multiple equilibria being realized in the data. Second, the proposed test does not require the estimation of payoff functions to separate the problems of multiple equilibria and common knowledge unobserved heterogeneity. The latter feature is useful for empirical researchers interested in testing degenerate equilibrium selection in the hope of leveraging multiple equilibria as a source of variation to identify payoffs when commonly used exclusion restrictions are not available. Moreover, no parametric assumption is needed for the payoff functions or for the distributions of the unobservables, besides the finite mixture representation (which implies restrictions that are separately testable). The main identifying assumption is the existence of an observable variable that can be interpreted as a proxy for the common knowledge unobserved heterogeneity. The testable implications are generically sufficient for the single equilibrium in the data assumption under some verifiable conditions. The test boils down to a specification test and can be implemented using the intersection bounds framework of Chernozhukov et al. (2013), whose good properties are corroborated in the simulations.

A.1. Proof of Lemma 1
While Henry et al. (2014) only consider mixtures of marginal distributions, their results also apply to mixtures of joint distributions. Notice that the (i, j)th element of P(x_NE) can be written as

(A.2)
In other words, the matrix P(x_NE) can be written as the product of two matrices. The first is a matrix with |V(x_NE)| − 1 columns. The second is a (|V(x_NE)| − 1) × (J + 1)^N matrix with (i, j)th element p(y_j|x_NE, ν_i) − p(y_j|x_NE, ν_0). By Assumption 5(vi), the second matrix has full row rank. The rank of the product of the two matrices is therefore equal to the rank of the first matrix, which is |V(x_NE)| − 1 by Assumption 5(v).
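The rank argument can be verified numerically: when the second factor has full row rank, the product has the same rank as the first factor. The dimensions below are hypothetical; `first` plays the role of the first factor and `second` that of the full-row-rank factor.

```python
import numpy as np

rng = np.random.default_rng(0)

# A 5 x 4 matrix of rank 2, built as a product of generic factors.
first = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))
# A generic 4 x 9 matrix: full row rank (rank 4) with probability one.
second = rng.standard_normal((4, 9))

rank_first = int(np.linalg.matrix_rank(first))
rank_product = int(np.linalg.matrix_rank(first @ second))
```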

A.2. Proof of Proposition 1
For any y_ζ ∈ Y^{N_ζ}, N_ζ ⊆ N, and x_E ∈ X_E, one can write p(y_ζ|x_E) = p(y_ζ|ν_0) + Σ_{l=1}^{|V|−1} [p(y_ζ|ν_l) − p(y_ζ|ν_0)] γ_ν_l(x_E). First, consider the expression for p(y_ζ|ν_l). Some more notation is needed. Let δ(y_ζ) be the (|V| − 1) × 1 vector with lth element p(y_ζ|ν_l) − p(y_ζ|ν_0), and let ϒ_c(x̃_E) be the column of ϒ corresponding to x̃_E. Evaluating (A.3) at any x̃_E ∈ X̃_E and subtracting (A.4) from it, one can write the y_ζ-specific column of P_ζ as p_ζ,c(y_ζ) = ϒ δ(y_ζ). Since ϒ is invertible, it follows that δ(y_ζ) = ϒ^{−1} p_ζ,c(y_ζ). Rearranging (A.4), p(y_ζ|ν_0) can be written as p(y_ζ|ν_0) = p(y_ζ|x_E^0) − φ′ ϒ^{−1} p_ζ,c(y_ζ).

(A.7)
It follows that p(y_ζ|ν_l) for any l = 0, 1, ..., |V| − 1 can be written as p(y_ζ|ν_l) = p(y_ζ|x_E^0) + [e_l − φ]′ ϒ^{−1} p_ζ,c(y_ζ), (A.8) which is the expression in Proposition 1. Second, consider the expression for γ(x_E). Since γ(x_E) does not depend on y, the expression can be derived for any subset N_ζ ⊆ N. Without loss of generality, consider N. Using (A.8), the (|V| − 1) × (J + 1)^N matrix with (i, j)th element p(y_j|ν_i) − p(y_j|ν_0) can be written as ϒ^{−1} P. Therefore, subtracting (A.4) from (A.3) for all y ∈ Y^N, the x_E-specific row of P is p_r(x_E) = [γ(x_E) − φ]′ ϒ^{−1} P. Since ϒ and P are both invertible, solving for γ(x_E) yields the expression in Proposition 1.
A.3. Proof of Proposition 2

Part (iii) is an application of Kasahara and Shimotsu (2014, Prop. 4). It states that there is only one component in the finite mixture over τ and that, therefore, players' decisions are independent given x_NE and ν. In fact, since players' decisions are conditionally independent, the matrix P_ζn(ν) can be written as the product of two rank-1 matrices: the column vector collecting p(y_ζn|ν) for all y_ζn ∈ Y^n and the row vector collecting p(y_{n+1}|ν) for all y_{n+1} ∈ Y.
As a result, rank(P_ζn(ν)) = 1. Following Dawid (1979, p. 5), one can define the joint independence of N random variables inductively. In the current setting, players' decisions are independent if player 2's decisions are independent of those of player 1, player 3's decisions are independent of those of players 1 and 2, and so on. This is why, for a given ν, it suffices to consider the ranks of the matrices P_ζn(ν) for n = 1, ..., N − 1. By writing the elements of P_ζn(ν) as in (25), this condition defines restrictions on (φ, ϒ) in terms of point-identified probabilities.
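The rank-1 structure under conditional independence can be illustrated numerically; the probability vectors below are hypothetical.

```python
import numpy as np

# Under conditional independence given nu, P_zeta_n(nu) is the outer product
# of the two conditional probability vectors, hence a rank-1 matrix.
p_block = np.array([0.2, 0.5, 0.3])  # stand-in for Pr(y_zeta_n | nu) over Y^n
p_next = np.array([0.6, 0.4])        # stand-in for Pr(y_{n+1} | nu) over Y
P_zeta = np.outer(p_block, p_next)
rank = int(np.linalg.matrix_rank(P_zeta))
```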

A.4. Proof of Proposition 3
Necessity follows directly from the definition of I. Let θ_0 be the value of θ corresponding to the true data generating process, with choice probabilities denoted p_0(y|ν) for all y ∈ Y^N and all ν ∈ V. If T*(x, ν) = 1 for all ν ∈ V(x), then θ_0 ∈ I and, as a result, I ≠ ∅.
To show that I ≠ ∅ is generically sufficient for T*(x, ν) = 1 for all ν ∈ V(x), one can proceed as follows. First, for some arbitrary θ̃ ≠ θ_0, one can write the corresponding choice probabilities, denoted p̃(y_ζ|ν) for all y_ζ ∈ Y^{N_ζ}, all N_ζ ⊆ N, and all ν ∈ V, as functions of θ̃, θ_0, and the true choice probabilities. Then, if I ≠ ∅ fails to be sufficient for T*(x, ν) = 1 for all ν ∈ V(x), it must be the case that there exists a θ̃ that belongs to I when θ_0 does not. To show generic sufficiency, one can therefore show that the set of true data generating processes for which such a θ̃ exists can be characterized via algebraic conditions and therefore has Lebesgue measure zero within the space of choice probabilities p(y|ν) for all y ∈ Y^N and all ν ∈ V. Somewhat similar algebraic arguments, although applied to a different context, can be found in Allman, Matias, and Rhodes (2009).
In order to write p̃(y_ζ|ν) as a function of θ̃, θ_0, and the true choice probabilities, evaluate (22) in Proposition 1 at θ̃ to get p̃(y_ζ|ν_l) = p(y_ζ|x_E^0) + [e_l − φ̃]′ ϒ̃^{−1} p_ζ,c(y_ζ), (A.10) where p(y_ζ|x_E^0) and p_ζ,c(y_ζ) are functions of θ_0 and the true p_0(y_ζ|ν) for ν ∈ V. In fact, one can write p(y_ζ|x_E^0) = p_0(y_ζ|ν_0) + φ_0′ δ_0(y_ζ), (A.11) and p_ζ,c(y_ζ) = ϒ_0 δ_0(y_ζ), (A.12) where, using notation similar to that in the proof of Proposition 1, δ_0(y_ζ) is the (|V| − 1) × 1 vector with lth element p_0(y_ζ|ν_l) − p_0(y_ζ|ν_0). It follows that p̃(y_ζ|ν_l) = p_0(y_ζ|ν_0) + [φ_0 + ϒ_0 ϒ̃^{−1}(e_l − φ̃)]′ δ_0(y_ζ). For θ̃ to belong to the identified set I, it must satisfy rank(P̃_ζn(ν_l)) = 1 for all n ∈ {1, ..., N − 1} and all l ∈ {0, ..., |V| − 1}. In particular, it must be the case that all minors of order 2 of P̃_ζn(ν_l) are equal to 0, and each such minor defines a quadratic expression in the reparameterization of θ̃. For θ̃ to belong to I, all the quadratic expressions corresponding to these minors must be equal to 0 when evaluated at θ̃. A necessary (but not sufficient) condition for θ̃ ∈ I is that the quadratic expressions have a common solution, which defines restrictions on the true data generating process. To show that these restrictions are satisfied by a set of data generating processes that has Lebesgue measure zero within the space of choice probabilities, one can use the properties of multivariate resultants (see, for instance, Cox, Little, and O'Shea, 2005, Chap. 3). Since the quadratic expressions are not homogeneous, one can homogenize them by multiplying some of their terms by properly defined powers of an additional variable. This is without loss of generality since the resultant of the homogenized polynomials is the same as the resultant of their non-homogenized counterparts (e.g., Cox et al., 2005, p. 81). Since φ_0 + ϒ_0 ϒ̃^{−1}(e_l − φ̃) is a vector with |V| − 1 elements, there are |V| unknowns in the homogenized polynomials.
Pick |V| homogeneous polynomials constructed from |V| minors of order 2, which is possible provided that Σ_{n=1}^{N−1} C((J+1)^n, 2) × C(J+1, 2) ≥ |V|. By Cox et al. (2005, Thm. 2.3), these |V| homogeneous polynomials in |V| unknowns have a common nontrivial solution if and only if the resultant of the |V| homogeneous polynomials is equal to zero. The resultant is a uniquely defined polynomial in the coefficients of the homogeneous polynomials and is therefore a polynomial in the choice probabilities p_0(y|ν), for all y ∈ Y^N and all ν ∈ V, corresponding to the true data generating process. One can construct such a resultant for different subsets of |V| minors of order 2. The set of true choice probabilities for which all such resultants are simultaneously equal to 0 defines an algebraic variety (Allman et al., 2009, p. 3105). This variety is proper since any data generating process for which rank(P̃_ζn(ν_l)) ≠ 1 for some n ∈ {1, ..., N − 1} and/or some l ∈ {0, ..., |V| − 1} does not lie in it. It follows that the set of true choice probabilities for which all such resultants are simultaneously equal to 0 has dimension smaller than that of the space of choice probabilities, and hence the set of data generating processes for which θ̃ ∈ I necessarily has Lebesgue measure zero within the space of choice probabilities p(y|ν) for all y ∈ Y^N and all ν ∈ V.
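The resultant criterion can be illustrated with toy univariate polynomials. The paper's argument uses multivariate resultants of homogenized quadratics; this sketch, which assumes sympy is available, only shows the vanishing-iff-common-root property.

```python
import sympy as sp

t = sp.symbols('t')
f = t**2 - 3*t + 2   # roots 1 and 2
g = t**2 - 5*t + 6   # roots 2 and 3: shares the root t = 2 with f
h = t**2 - 7*t + 12  # roots 3 and 4: shares no root with f

# The resultant of two polynomials vanishes iff they have a common root.
res_fg = sp.resultant(f, g, t)
res_fh = sp.resultant(f, h, t)
```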

A.5. Proof of Proposition 4
Since both Y and X have finite and discrete support, p_0 can be estimated as the vector of sample averages of properly defined dummy variables. Since p̂ is the only source of randomness in g(p̂, θ), one can use the estimation and inference results from Chernozhukov et al. (2013). When θ ∈ I, inf_{l∈𝓛} g_l(p_0, θ) ≥ 0, such that Pr(0 > ĝ(θ, α)) ≤ Pr(inf_{l∈𝓛} g_l(p_0, θ) > ĝ(θ, α)). By the definition of CS_M(α) in (45), E_θ[ξ_M] = Pr(CS_M(α) = ∅) ≤ Pr(θ ∉ CS_M(α)) = Pr(ĝ(θ, α) < 0), and the result follows from the limit established in Chernozhukov et al. (2013).

APPENDIX B. IMPLEMENTATION OF THE INTERSECTION BOUNDS APPROACH
In the case with N players, J + 1 actions, and |V| = 2 components in the mixture over the unobserved heterogeneity, the inequalities to be checked can be written as follows. The functions Q(x_E), L_0(y), and L_1(y) are defined in (30). The conditions in Proposition 2(i) can be rewritten accordingly, as can the conditions in Proposition 2(ii). One then collects all of these inequalities in g(p̂, θ) and sets ĝ(θ, α) = inf_{l∈𝓛} [g_l(p̂, θ) + ĉ(θ, 1 − α) σ̂_l(θ)].
To reduce the computational burden of the grid search, one should leverage the fact that the conditions in Proposition 2(iii) drastically shrink the identified set. The simulations therefore use as starting values of θ the grid points closest to the values solving (B.6) and (B.10).