Post hoc score category collapsing for L2 pronunciation research

Published online by Cambridge University Press:  11 February 2026

Taichi Yamashita*
Affiliation:
Faculty of Foreign Language Studies, Kansai University, Osaka, Japan

Abstract

Second language (L2) pronunciation research has measured speech comprehensibility by asking listeners to assess L2 learners’ speaking performance with rating scales. While some studies have provided validity evidence for these rating scales, few studies have examined the extent to which those scales effectively distinguish among L2 speakers. To fill this gap, the present study examines the 9-point scale used in Saito et al. (2020: Annual Review of Applied Linguistics, 40, 9–25.) and the 100-point scale in Huensch and Nagle (2023: Studies in Second Language Acquisition, 45(2), 571–585.) from a Rasch measurement perspective and showcases post hoc score category collapsing as a potential countermeasure against suboptimal rating scale functioning. Findings suggested that different score categories represented the same ability level and were therefore interchangeable. Collapsing these score categories yielded shorter but more functional scales without compromising the psychometric qualities of the original scales. These findings suggest that researchers need to empirically refine their scale lengths rather than uncritically following their conventional measurement practices.
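As a minimal sketch of what post hoc category collapsing amounts to at the data level, the snippet below recodes raw ratings with a scheme string such as 112223344, in which the digit at each position gives the new category for the corresponding original score. The function name `collapse_scores` and its `lowest` parameter are hypothetical, introduced here for illustration only; this is not the author's analysis code, and the Rasch estimation that would follow the recoding is not shown.

```python
def collapse_scores(scores, scheme, lowest=1):
    """Recode raw ratings using a collapsing scheme string.

    The character at position i of `scheme` is the new category assigned
    to the raw score `lowest + i`. For example, "112223344" maps raw
    scores 1-2 to 1, 3-5 to 2, 6-7 to 3, and 8-9 to 4.
    (Hypothetical helper, for illustration only.)
    """
    mapping = {lowest + i: int(ch) for i, ch in enumerate(scheme)}
    return [mapping[s] for s in scores]


# Illustrative input: one rating at each point of a 9-point scale.
raw = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(collapse_scores(raw, "112223344"))  # [1, 1, 2, 2, 2, 3, 3, 4, 4]
```

The recoded ratings would then be re-analyzed so that the category statistics of the shorter scale can be compared with those of the original scale.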

Information

Type
Methods Forum
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike licence (http://creativecommons.org/licenses/by-nc-sa/4.0), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the same Creative Commons licence is used to distribute the re-used or adapted article and the original article is properly cited. The written permission of Cambridge University Press or the rights holder(s) must be obtained prior to any commercial use.
Open Practices
Open materials
Copyright
© The Author(s), 2026. Published by Cambridge University Press

Figure 1. Collapsing schemes tested for the 9-point scale in Saito et al. (2020).

Figure 2. Collapsing schemes for the 100-point scale in Huensch and Nagle (2023).

Table 1. Score category statistics of the original 9-point scale

Figure 3. Probability curves of the original 9-point scale visualizing the probabilities of score categories (y-axis) as a function of examinee ability (x-axis).

Table 2. Score category statistics of the four collapsed scales

Figure 4. Probability curves for the 3-point (111222333, top left), 5-point (112234455, top right), 4-point (112223344, bottom left), and 4-point scales (112233344, bottom right).

Figure 5. Variable maps for the original 9-point scale and the collapsed 3-point, 4-point (112223344), and 4-point (112233344) scales.

Table 3. Comparison among the original 9-point, 3-point (111222333), 4-point (112223344), and 4-point (112233344) scales

Table 4. Score category statistics of the 10 score categories associated with the highest threshold distances on the 100-point scale

Table 5. Score category statistics of the 10-point scale

Table 6. Score category statistics of the 5-point scale

Table 7. Score category statistics of the 4-point scale

Figure 6. Probability curves of the 100-point (top left), 10-point (top right), 5-point (bottom left), and 4-point (bottom right) scales.

Figure 7. Variable maps of the original 100-point scale (left) and the collapsed 4-point scale (right).

Table 8. Comparison between the original 100-point scale and the collapsed 4-point scale