A statistical test of independence in choice data with small samples

Published online by Cambridge University Press:  01 January 2023

Michael H. Birnbaum*
Affiliation:
Department of Psychology, California State University, Fullerton, H-830M, Box 6846, Fullerton, CA 92834–6846, USA

Abstract

This paper develops tests of independence and stationarity in choice data collected with small samples. The method builds on the approach of Smith and Batchelder (2008). The technique is intended to distinguish cases where a person is systematically changing “true” preferences (from one group of trials to another) from cases in which a person is following a random preference mixture model with independently and identically distributed sampling in each trial. Preference reversals are counted between all pairs of repetitions. The variance of these preference reversals between all pairs of repetitions is then calculated. The distribution of this statistic is simulated by a Monte Carlo procedure in which the data are randomly permuted and the statistic is recalculated in each simulated sample. A second test computes the correlation between the mean number of preference reversals and the difference between replicates, which is also simulated by Monte Carlo. Data of Regenwetter, Dana, and Davis-Stober (2011) are reanalyzed by this method. Eight of 18 subjects showed significant deviations from the independence assumptions by one or both of these tests, which is significantly more than expected by chance.

Information

Type
Research Article
Creative Commons
The authors license this article under the terms of the Creative Commons Attribution 3.0 License.
Copyright
Copyright © The Authors [2012] This is an Open Access article, distributed under the terms of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.

Table 1: Analyses of actual data of Regenwetter, et al. (2011) replicating Tversky (1969). The mean of z is the mean number of disagreements (preference reversals) out of 10 choices between two rows, averaged over all possible pairs of rows. The variance of z is the variance of these entries for the original data. The p_v values are the proportion of simulated permutations of the data for which the calculated variance of z in the permuted sample is greater than or equal to that of the original data (each is based on 100,000 computer-generated, pseudo-random permutations of the data). Correlations between the mean rate of preference reversals and the difference between repetitions are shown in the column labeled r. Corresponding p_r values are shown in the last column.

Table 2: Raw data for Subject #2 of Regenwetter, et al. (2011), who showed violations of iid on both indices. Each row shows results for one repetition of the study.

Figure 1: Estimated distribution of variance of the entries in z matrices generated from random permutations of the original data matrix for case #2 (Table 2), based on 100,000 pseudo-random permutations of the data in Table 2. None of the simulations exceeded the value observed in the original data, 4.29. Case #2 was selected as showing the most systematic evidence against the iid assumptions.

Table A.1. A hypothetical set of data with two choices and 20 repetitions.

Table A.2. Crosstabulation of the hypothetical data from Table A.1.

Table A.3. Results of Monte Carlo simulation of hypothetical data with 20 reps. Cells a, b, c, and d represent the frequencies of (0, 0), (0, 1), (1, 0), and (1, 1), respectively. The column labeled “Fisher” shows the calculated probability for the Fisher exact test of independence; the last column shows the simulated p_v, based on 10,000 random permutations.

Table A.4. Hypothetical data consistent with transitivity and with the iid assumptions of Regenwetter, et al. (2010, 2011). These data are coded such that 1 = preference for the first stimulus in each choice and 0 = preference for the second stimulus in each choice.

Table A.5. Hypothetical data consistent with transitivity, but not with the iid assumptions of Regenwetter, et al. (2010). In this case, the subject started with the transitive order, ABCDE, for six blocks of trials, then switched to the opposite order for the last four blocks of trials.

Table A.6. Hypothetical data that violate both transitivity and the iid assumptions of Regenwetter, et al. (2010, 2011). In this case, the person used an intransitive lexicographic semiorder for four blocks, followed by two transitive blocks of trials, followed by an opposite intransitive pattern for four blocks.

Table A.7. Results of Monte Carlo simulations for hypothetical data with three variables. The table lists the hypothetical frequencies of response combinations, the total n, and the p-values given by three methods. The last column shows Monte Carlo results based on 10,000 simulations.

Supplementary material

Birnbaum supplementary material (File, 3.4 KB)
Birnbaum supplementary material (File, 1.5 KB)