Status-impact assessment: is accuracy linked with status motivations?

Status hierarchies are ubiquitous across cultures and have persisted over deep time. Position in hierarchies shows important links with fitness outcomes. Consequently, humans should possess psychological adaptations for navigating the adaptive challenges posed by living in hierarchically organised groups. One hypothesised adaptation functions to assess, track, and store the status impacts of different acts, characteristics, and events in order to guide hierarchy navigation. Although this status-impact assessment system is expected to be universal, there are several ways in which differences in assessment accuracy could arise. This variation may link to broader individual difference constructs. In a preregistered study with samples from India (N = 815) and the USA (N = 822), we examined how individual differences in the accuracy of status-impact assessments covary with status motivations and personality. In both countries, greater overall status-impact assessment accuracy was associated with higher status motivations, as well as higher standing on two broad personality constructs: Honesty-Humility and Conscientiousness. These findings help map broad personality constructs onto variation in the functioning of specific cognitive mechanisms and contribute to an evolutionary understanding of individual differences.

Associations between individual difference characteristics and indices of accuracy. The partial correlation estimates and 95% confidence intervals were computed from the model-estimated t-statistics and degrees of freedom. Note that differential accuracy associations are based on the interactions between the individual difference characteristics and peer-assessed status impacts, and the elevation accuracy associations are based on the main effects. H = Honesty-Humility; E = Emotionality; X = eXtraversion; A = Agreeableness; C = Conscientiousness; O = Openness. Status Motivation and the HEXACO traits are grand-mean centered and standardized; Female is an effect coded variable where -1 = Male, 1 = Female. Age is grand-mean centered but not standardized.
The small but statistically significant interaction between status motivations and peer-assessed status impacts (p = .0004) suggests that participants who scored higher on our measure of status motivations tended to have steeper slopes (i.e., better differential accuracy) than those who scored lower on status motivations. Although the main effect estimate for status motivations is positive, it was not statistically significant (p = .172), so we do not have strong support for a relationship between status motivations and participant intercepts (i.e., elevation accuracy).
Differential accuracy (i.e., variation in participant slopes) was further predicted by age (p = .028) and Honesty-Humility (p = .016), such that older participants and those who scored higher on Honesty-Humility tended to have slightly better differential accuracy. We did not find evidence of reliable associations between differential accuracy and participant sex (p = .060) or the other HEXACO traits (ps > .478), and the point estimates for the latter are centered around zero. Elevation accuracy (i.e., variation in participant intercepts) was statistically significantly associated with participant sex (p = .027) and Honesty-Humility (p = .002), but not with the other individual difference characteristics (ps > .053). Given that a participant with perfect elevation accuracy would have an intercept of zero, interpretation of the statistically significant elevation associations is aided by Figure 2, where the model-estimated participant intercepts are plotted as a function of sex and Honesty-Humility. Male participants tended to overestimate status impacts in relation to their peers, while female participants tended to underestimate them (Figure 2A). The negative association between participant intercepts and Honesty-Humility appears to be driven by low-scoring participants tending to overestimate status impacts, high scorers tending to underestimate them, and participants scoring closer to average levels of Honesty-Humility tending to be relatively more accurate (Figure 2B).
Figure 2. Depictions of participant intercepts (i.e., elevation accuracy) as a function of self-reported sex (Panel A) or scores on the Honesty-Humility scale (Panel B). Importantly, the statistical tests of these trends were based on associations with participants' latent intercepts, not the extracted estimates of their intercepts depicted here. The black squares on Panel B indicate the mean intercepts for each sex.
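The partial correlations and confidence intervals reported above are recovered from model t-statistics and their degrees of freedom. A minimal sketch of that conversion in Python (the study's analyses were run in R; the Fisher-z interval below, which treats the degrees of freedom as the effective sample size, is a common approximation and may differ from the exact method used in the paper):

```python
import math

def partial_r(t, df):
    """Partial correlation recovered from a t-statistic and its
    degrees of freedom: r = t / sqrt(t^2 + df)."""
    return t / math.sqrt(t * t + df)

def fisher_ci(r, df, z_crit=1.96):
    """Approximate 95% CI via the Fisher z-transformation.
    Assumption: df is used as the effective sample size for the
    standard error, which is only an approximation."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(df - 1)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)
```

For example, `partial_r(2.0, 100)` returns roughly .196, and `fisher_ci` brackets that point estimate with an interval that narrows as the degrees of freedom grow.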
We note that participants' self-assessments of the status impacts of the 150 personal characteristics are very strongly positively associated with peer-assessments of the status impacts of those personal characteristics (b = .80, r = .99, p < .001). The estimated intercept was not statistically different from zero (p = .881), suggesting that the average person's status impact estimate will be zero when their peers' assessment is zero. The variance components reveal that there is generally little variation in slopes (i.e., differential accuracy; σ = .01) and intercepts (i.e., elevation accuracy; σ = .01) across participants. Together, these results suggest that agreement about status impacts between a given individual and their peers can be expected to be quite high. Additionally, the variance components reveal that latent slopes and intercepts are weakly negatively correlated (r = -.13), indicating that they are capturing largely distinct aspects of accuracy in the case of status-impact assessment as we have measured it.
To investigate whether there may be a sex difference in overall status motivations, we conducted a Welch two-sample t-test. The mean score of status motivation for men was 5.16 (SD = 1.10) and the mean for women was 5.27 (SD = .74). The difference between these means was not statistically significant, t = 0.91, df = 166.55, p = .367, 95% CI [-.13, .36], d = .14. Thus, we cannot reject the hypothesis that the difference in status motivations between men and women is equal to zero.
Finally, Figure 3 shows all the correlations between the individual difference constructs we assessed in the current study. Of the 21 pairwise correlations between variables included in the current study, four were statistically significant after implementing a Holm correction for multiple tests.
Status motivations were positively associated with Extraversion (r = .25, p < .001); Honesty-Humility was positively associated with both Conscientiousness (r = .24, p < .001) and Agreeableness (r = .27, p < .001); and Emotionality was negatively correlated with Extraversion (r = -.24, p < .001).
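The Welch test, Cohen's d, and Holm correction used above can be sketched as follows. This is an illustrative Python version under standard textbook formulas, not the authors' own code (which was written in R):

```python
import math
from statistics import mean, stdev

def welch_t(x, y):
    """Welch's two-sample t statistic and Welch-Satterthwaite df."""
    n1, n2 = len(x), len(y)
    v1, v2 = stdev(x) ** 2, stdev(y) ** 2
    se2 = v1 / n1 + v2 / n2
    t = (mean(x) - mean(y)) / math.sqrt(se2)
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df

def cohens_d(x, y):
    """Cohen's d using a pooled standard deviation."""
    n1, n2 = len(x), len(y)
    sp = math.sqrt(((n1 - 1) * stdev(x) ** 2 + (n2 - 1) * stdev(y) ** 2)
                   / (n1 + n2 - 2))
    return (mean(x) - mean(y)) / sp

def holm_adjust(pvals):
    """Holm step-down adjusted p-values: sort ascending, multiply the
    i-th smallest p by (m - i), then enforce monotonicity."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adj = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, min(1.0, (m - rank) * pvals[i]))
        adj[i] = running_max
    return adj
```

A correlation is then declared significant when its Holm-adjusted p-value falls below .05, which controls the family-wise error rate across the 21 pairwise tests.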

Registered Report Power Analysis
For the registered report, we based our target sample size on a bootstrap power analysis of our pilot data (cf. Kleinman & Huang, 2016; Strong & Alvarez, 2019). While we do not know the true effects, the bootstrapped power analysis provides insight into the relationships between item and participant sample size settings and statistical power for the range of effect sizes that we observed in the pilot data, which provide our best guess at the expected effects in the replication samples. We iteratively bootstrapped 500 datasets based on different combinations of sample sizes for the items (ns = 20, 30, 40, 50) and participants (ns = 300, 400, 500, 600, 700, 800), which reflect different potential settings for the design of the proposed studies. We ran the main multilevel model with all focal interactions and controls on each bootstrapped data set and computed the mean effect size estimate and power (i.e., the percentage of statistically significant results) across the different setting combinations. Figure S1 shows the relationships between power and the average effect size (partial r calculated from the t-statistics and degrees of freedom) across bootstrapped analyses for each sample size setting. The effect size results suggest that analyses based on 40 or 50 items tended to converge on similar effect estimates, whereas the estimated effects in analyses based on fewer items were comparably smaller (suggesting that effects are truncated in small samples of items); thus, we will present a random subset of 40 items to participants. Further, the power results suggest that we should have at least 80% power to detect effects larger than r = .1 with 40 items and between 400-800 participants.
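The logic of a simulation-based power analysis like the one above can be sketched in a few lines. This simplified Python version generates data with a known correlation rather than resampling the actual pilot data, fits a simple bivariate test rather than the full multilevel model, and uses a normal approximation to the t reference distribution, so it illustrates the procedure rather than reproducing the reported analysis:

```python
import math
import random

def simulate_power(r, n, reps=300, z_crit=1.96, seed=1):
    """Estimate power to detect a true correlation r with n participants:
    simulate reps samples, test each, and return the fraction that reach
    significance (normal approximation to the t distribution)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        xs = [rng.gauss(0, 1) for _ in range(n)]
        ys = [r * x + math.sqrt(1 - r * r) * rng.gauss(0, 1) for x in xs]
        mx, my = sum(xs) / n, sum(ys) / n
        sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sxx = sum((x - mx) ** 2 for x in xs)
        syy = sum((y - my) ** 2 for y in ys)
        rhat = sxy / math.sqrt(sxx * syy)
        t = rhat * math.sqrt((n - 2) / (1 - rhat * rhat))
        if abs(t) > z_crit:
            hits += 1
    return hits / reps
```

Sweeping such a function over grids of item and participant counts, as the bootstrap analysis did, yields a power surface from which a design can be chosen.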
We also conducted a separate power analysis for the planned t-tests of status motivations using the pwr.t.test function in the pwr package (Champely et al., 2018), which suggested that we would need 393 participants per group (i.e., men and women) to be 80% powered to detect a small (d = .2) sex difference. Based on these two power analyses, we set our target sample size at 800 participants (400 men and 400 women) per country.
Comparing the typical way of computing accuracy components to the multilevel model estimation of these components
To make sure that the parameters from the multilevel model correspond appropriately to the indices of differential and elevation accuracy that are typically used, we looked at correlations between the indices derived from the two approaches. We computed differential accuracy as (a) the correlation between each person's self-assessed status impacts and their peers' assessments, and (b) the participants' random slope estimates extracted from the multilevel models in which self-assessments and peer-assessments are standardized within participants. We computed elevation accuracy as (a) the mean of the differences between each participant's self-assessments and the assessments of their peers for each item, and (b) the participants' random intercept estimates extracted from the multilevel models.
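The "typical" indices, approach (a) for each component, reduce to a within-person correlation and a mean difference. A minimal Python sketch of those two computations (function names are illustrative; the study computed these in R):

```python
import math
from statistics import mean

def differential_accuracy(self_scores, peer_scores):
    """Approach (a): the within-person Pearson correlation between
    self-assessed and peer-assessed status impacts across items."""
    ms, mp = mean(self_scores), mean(peer_scores)
    num = sum((s - ms) * (p - mp) for s, p in zip(self_scores, peer_scores))
    den = math.sqrt(sum((s - ms) ** 2 for s in self_scores)
                    * sum((p - mp) ** 2 for p in peer_scores))
    return num / den

def elevation_accuracy(self_scores, peer_scores):
    """Approach (a): the mean self-minus-peer difference across items
    (0 = perfectly calibrated; positive values indicate over-estimation)."""
    return mean(s - p for s, p in zip(self_scores, peer_scores))
```

A participant who ranks the items exactly as their peers do scores 1.0 on the first index, and one whose ratings are uniformly one scale point above their peers' scores 1.0 on the second, even though their rank ordering is perfect.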
The minimum, maximum, mean, and standard deviation of the accuracy indices are shown in Table S1 below, and the correlations among the different ways of computing accuracy are shown in Figure S2. The strong correlations in both countries suggest that the multilevel models are correctly estimating the accuracy parameters that have been used in previous research. Note, however, that the MLM approach appropriately incorporates measurement error when estimating associations between the accuracy components and other individual difference constructs.
Table S2 shows the alpha reliabilities and other relevant information for the status motive and HEXACO scales. Reliability is high for the status motive measures, but generally low for the HEXACO measures. As noted in the main text, this low reliability for the HEXACO measure is to be expected given that each subscale is composed of very few items and the items themselves are intended to cover the full breadth of the trait construct rather than a narrow piece of the trait space (de Vries, 2013). Still, the reliability of some scales in India was especially poor (e.g., Agreeableness, Emotionality). However, the correlations between the individual differences were largely similar in each country, suggesting that the measures are tapping similar constructs in both countries even though reliability is low. Nonetheless, caution is warranted in interpreting associations involving these measures, because differences in measurement error between the scales can drive differences in the results (Westfall & Yarkoni, 2016).
Same results with another measure of status motivation
Figure S3 shows that the results of the focal model are essentially identical using an alternative measure of status motivation (the Need for Status Scale developed by Flynn et al., 2006) that we included in the study to assess whether the results are dependent on the status motivation measure.
As reported in the main text, there were no statistical differences in the results between the measures (see R code for full output tables).
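For reference, the alpha reliabilities summarized in Table S2 follow the standard Cronbach's alpha formula. A self-contained Python sketch (illustrative only; the reported reliabilities were computed in R):

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for k item-score columns (each a list of the same
    participants' responses): k/(k-1) * (1 - sum of item variances /
    variance of total scores). Low values are expected when a scale has
    few, deliberately heterogeneous items, as with the brief HEXACO
    subscales discussed above."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var / variance(totals))
```

When items are perfectly redundant the total-score variance dominates the summed item variances and alpha approaches 1; broad-bandwidth items with weaker inter-item correlations pull it down without necessarily implying a poor measure.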