
Interpretational issues with the bifactor model: a commentary on ‘Defining the p-Factor: An Empirical Test of Five Leading Theories’ by Southward, Cheavens, and Coccaro

Published online by Cambridge University Press:  11 April 2023

Conor V. Dolan*
Affiliation:
Netherlands Twin Register, Department of Biological Psychology, Vrije Universiteit, Van der Boechorststraat 7-9, 1081 BT, Amsterdam, The Netherlands
Denny Borsboom
Affiliation:
Department of Psychology, Faculty of Behavioral and Social Sciences, University of Amsterdam, Nieuwe Achtergracht 129-B, 1018WS Amsterdam, The Netherlands
*Author for correspondence: Conor V. Dolan, E-mail: c.v.dolan@vu.nl

Abstract

Southward, Cheavens, and Coccaro (2022, Psychological Medicine) conducted an ambitious investigation aimed at determining the nature of the general p factor of psychopathology by considering the correlation between the p factor and five candidate constructs. Generally, in this area of research, the bifactor model is preferred to the second order common factor model. In this commentary, we identify several interpretational issues concerning the bifactor model, which are based on a realistic psychometric view of latent variables. These issues may hamper the study of the nature of the p factor using the bifactor model.

Type
Invited Commentary
Copyright
Copyright © The Author(s), 2023. Published by Cambridge University Press

Southward, Cheavens, and Coccaro (2022) have conducted an ambitious investigation aimed at determining the nature of the general p factor of psychopathology by considering the correlation between the p factor and five candidate constructs. The main results of Southward et al. (2022) are the correlation matrices of the p factor and the five candidate constructs in two versions of bifactor models (Caspi et al., 2014; Lahey et al., 2012; Lahey, Moore, Kaczkurki, & Zald, 2021), one with correlated and one with uncorrelated residual group factors (see their online Supplementary Tables S4b and S4c). The reported correlations, which may be interpreted as ‘validity coefficients’, are useful as they may help to determine the meaning of the p factor. In our view, the bifactor model, while useful in some contexts, poses problems of interpretation that require more attention in the literature. The aim of the present commentary is to discuss these problems by comparing the bifactor model and the second-order factor model. The outline of this commentary is as follows. We first present the bifactor model and the second order factor model in conceptual terms. We illustrate these using path diagrams in Figs 1 and 2. The path diagrams do not present the full models; we consider only four indicators of a given common factor, as this is sufficient given our present aims. Following the presentation of these models, we discuss two main issues: the issue of unidimensionality and the analyses of external dependent and independent covariates. We limit this commentary to indicators which are test items or individual symptoms. Sometimes subtests are used as indicators, but this is not relevant to the issues that we discuss.
The figures include the parameters to which we refer in our commentary. Whenever we refer to a parameter in a model, we mention the relevant figure.

Fig. 1. Left: the bifactor model, with the general first order factor p1 and the specific group factor sk. Only one sk is depicted (in practice there are three or more factors sk). The observed symptoms or items, i1 to i4, are regressed on p1 and sk. The variables ε are residuals in the regression of i1 to i4 on p1 and sk. Right: The observed x and y are external predictor and dependent variables, respectively. The variable ζp is the residual in the regression of p1 on x, and ζsk is the residual in the regression of sk on x. The variable ζy is the residual in the regression of y on sk and p1.

Fig. 2. Left: The second order factor model, with second order factor p2, and a single first order common factor fk (in practice there are three or more factors fk). The observed symptoms or items, i1 to i4, are indicators of the latent variable fk. The variables ε are residuals in the regression of the indicators on the factor fk, and the variable ζf is the residual in the regression of fk on p2. Right: The observed x and y are external predictor and dependent variables, respectively. The variable ζp is the residual in the regression of p2 on x, ζy is the residual in the regression of y on p2 and fk, and ζf is the residual in the regression of fk on x and p2.

Bifactor model and second order factor model

In the bifactor model, the p factor (denoted p1, see Fig. 1) features as the general factor on which all observed symptoms (or items) load directly. As such, the factor p1 accounts for variance that is common to all symptoms, and provides a partial account of the symptom correlations. The account is partial, because the sets of symptoms relating to dimensions of psychopathology (e.g. the set consisting of symptoms of depression) are expected to display residual correlations. Specific group factors (denoted sk, k = 1,2,…,K, Fig. 1) are included in the model to account for the residual correlations. In this model, the p-factor p1 is uncorrelated with the specific group factors sk, but the specific group factors sk may be correlated (see Southward et al., 2022).

The bifactor model represents one model among a range of candidate models that can be used to analyze the covariance structure of mental health problems (Borsboom et al., 2021; Chen, West, & Sousa, 2006; Markon, 2019; Yung, Thissen, & McLeod, 1999). A well-known alternative model is the second-order factor model, in which the p factor features as a second-order factor (denoted p2; see Fig. 2). In this model, the symptoms load on correlated first (or lower) order factors (denoted fk, k = 1,2,…,K, Fig. 2), which in turn load on the factor p2. So, while the observed symptoms are related directly to p1 in the bifactor model, in the second-order model, the first order factors fk mediate the relationship between p2 and the observed symptoms. Thus, in this model the p-factor p2 accounts for the correlations among the factors fk, and, as such, may represent a common cause of the common factors fk.
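The difference between the two specifications can be made concrete with a small numerical sketch. The code below is an illustration only, under assumptions that are not taken from Southward et al. (2022) or from the figures: all factors are standardized, the loadings are hypothetical values, and the function names are ours. It shows that in the second-order model the correlation between two first order factors fk and fl equals the product of their loadings on p2, whereas in the bifactor model the item covariances within a block arise from direct loadings on p1 and sk.

```python
import numpy as np

def second_order_factor_corr(a):
    """Correlations among first order factors f_k implied by p2.

    Assumes f_k = a_k * p2 + zeta_fk with Var(p2) = 1 and
    Var(zeta_fk) = 1 - a_k**2, so that every f_k has unit variance.
    """
    a = np.asarray(a, dtype=float)
    corr = np.outer(a, a)        # Cov(f_k, f_l) = a_k * a_l for k != l
    np.fill_diagonal(corr, 1.0)  # unit variances by construction
    return corr

def bifactor_item_cov(lam_p, lam_s, theta):
    """Item covariance matrix within one block of the bifactor model.

    Assumes p1 and s_k are uncorrelated with unit variance; theta holds
    the residual (epsilon) variances of the items.
    """
    lam_p = np.asarray(lam_p, dtype=float)
    lam_s = np.asarray(lam_s, dtype=float)
    return np.outer(lam_p, lam_p) + np.outer(lam_s, lam_s) + np.diag(theta)

# Hypothetical parameter values, chosen only for illustration.
factor_corr = second_order_factor_corr([0.8, 0.7, 0.6])
sigma_bifactor = bifactor_item_cov(
    lam_p=[0.6, 0.5, 0.4, 0.7],      # loadings of i1..i4 on p1
    lam_s=[0.5, 0.6, 0.3, 0.2],      # loadings of i1..i4 on s_k
    theta=[0.39, 0.39, 0.75, 0.47],  # residual variances (unit item variances)
)
print(factor_corr)
print(sigma_bifactor)
```

In this sketch, p2 is the only source of the correlations among the fk, while in the bifactor block each item covariance is the sum of a p1 part and an sk part.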

Many articles have been devoted to the interpretative merits of the bifactor model, specifically with respect to regression modeling of a dependent external variable. The merits lie in the fact that the decomposition of the dependent variable variance is simple, because the factors sk and the factor p1 are uncorrelated (as mentioned, the specific factors sk may be correlated, as is the case in online Supplementary Table S4c in Southward et al., 2022). In the second-order factor model, the decomposition of the dependent variable variance is complicated by the fact that (1) the factors fk and p2 are correlated, and (2) given K + 1 predictors, one regression coefficient has to be fixed (to zero) to achieve model identification. Exactly which coefficient is fixed is generally arbitrary. The same applies with respect to external predictors of the common factors: all common factors in the bifactor model may be regressed on a given predictor, but in the second-order factor model, one regression coefficient has to be fixed.

While the advantage of the bifactor model in regression modeling may be relevant in a predictive context, it does not necessarily translate to an advantage in an explanatory context. In the latter context, the goal is to develop a model that represents the p-factor optimally from a theoretical and interpretative point of view. The emphasis on this advantage of the bifactor model, coupled with inconsequential (van Bork, Epskamp, Rhemtulla, Borsboom, & van der Maas, 2017) or potentially misleading (Greene et al., 2019) differences in model fit between the bifactor model and the second-order factor model, has overshadowed the psychometric and interpretative problems of the bifactor model (see also Achenbach, 2021; Pettersson, Larsson, & Lichtenstein, 2021). As mentioned above, we discuss two issues. First, the bifactor model stipulates that symptoms or items associated with a given latent variable (e.g. depression) are bi-dimensional. This is inconsistent with the unidimensionality of measurement models, i.e. an important psychometric criterion, which is highly relevant to the interpretation of psychometric test scores (we return to this below). Second, the specification of uncorrelated sk and p1 rules out the investigation of the role of the p factor as a mediator of the relationship between external dependent and independent covariates and dimensions of psychopathology. This hinders the study of the role of the p factor as the general cause of psychopathology.

Unidimensionality v. bi-dimensionality

We adopt the view of symptoms as reflective indicators, which are directly and causally dependent on the latent variable (Borsboom, Mellenbergh, & van Heerden, 2003). Given a single well defined dimension of psychopathology, the set of reflective indicators (i.e. symptoms) making up the psychometric test, designed to measure the dimension, should satisfy unidimensionality. Unidimensionality, which is typically established during the development of the test, e.g. by means of common factor modeling, is important for the interpretation of the test scores as proxies of the dimension of psychopathology, as represented by the common factor fk. So, we assume that the common factor fk represents an interpretable dimension of psychopathology, which is measured using a unidimensional set of reflective indicators. We emphasize that we do not adhere to the position that psychometric theory is universally normative, in the sense that it requires all constructs to be unidimensional. Whether a construct is better represented in a unidimensional or multidimensional model is a substantive question that will depend on the topic studied. Our point of departure is rather that, if a construct is interpreted as being the common determinant of a set of item responses or subtest scores, unidimensionality is generally accepted as an important criterion. Given this point of departure, an interpretative problem of the bifactor model is that the indicator sets are specified to load on two uncorrelated common factors sk and p1. The implied bi-dimensionality is inconsistent with the unidimensionality of the indicators, as indicators of fk. This poses a problem of interpretation for both the factors sk and the factor p1. The factors sk are often interpreted in terms of the indicators on which they load, i.e. the sk are interpreted simply as fk. This is problematic for two reasons.
First, the common factors fk and sk cannot reference the same latent variables, given the presence of the factor p1, which is uncorrelated with the specific factors sk in the bifactor model. Second, because the nature of p1 is unknown, the nature of sk, the residual common factor in the presence of p1, is necessarily unknown as well. In addition, modeling the p-factor as a general first order factor, which accounts for symptom correlations, is problematic, because the explanandum is the correlations among the dimensions of psychopathology. Thus, the p-factor does not explain the correlations between the latent variables that potentially underlie mental disorders, but leaves them unmodeled, and thus essentially treats them as a nuisance. Note that the correlations among the symptoms themselves do not require explanation. The symptoms making up a given test are correlated because they depend on (indicate) the dimension (represented by fk) that the test was designed to measure. The second order factor model avoids these problems. First, while the nature of the p factor (p2) is unknown, the first order factors (fk) retain their unidimensionality and their interpretation, and thus are amenable to the standard psychometric analysis. Second, in contrast to the bifactor model, the p2 factor actually does fulfill a productive explanatory role in the second order factor model, which is consistent with the issue at hand, because it provides an account of the correlations among the distinct and well defined dimensions of psychopathology.
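The contrast between unidimensionality and bi-dimensionality can be checked numerically for a single block of four indicators. The sketch below is a hedged illustration with hypothetical loadings; the helper `common_part_rank` is our own name, not part of any package. It computes the rank of the common part of the item covariance matrix (the covariance matrix minus the residual variances): rank one corresponds to a unidimensional indicator set, rank two to a bi-dimensional one.

```python
import numpy as np

def common_part_rank(loadings, factor_cov, tol=1e-10):
    """Rank of Lambda @ Psi @ Lambda.T, the common part of the item covariance."""
    loadings = np.asarray(loadings, dtype=float)
    factor_cov = np.asarray(factor_cov, dtype=float)
    common = loadings @ factor_cov @ loadings.T
    return int(np.linalg.matrix_rank(common, tol=tol))

# Second order model, one block: i1..i4 load on f_k only.
# Var(f_k) is scaled to 1 here (an assumption made for simplicity).
lam_f = np.array([[0.7], [0.6], [0.5], [0.8]])
rank_second_order = common_part_rank(lam_f, np.array([[1.0]]))

# Bifactor model, same block: i1..i4 load on p1 (column 0) and s_k (column 1),
# which are uncorrelated with unit variance.
lam_ps = np.array([[0.6, 0.5], [0.5, 0.6], [0.4, 0.3], [0.7, 0.2]])
rank_bifactor = common_part_rank(lam_ps, np.eye(2))

# Special case: loadings on s_k proportional to those on p1.
lam_prop = np.column_stack([lam_ps[:, 0], 0.5 * lam_ps[:, 0]])
rank_proportional = common_part_rank(lam_prop, np.eye(2))

print(rank_second_order, rank_bifactor, rank_proportional)  # prints: 1 2 1
```

Within the block, the bifactor specification yields a rank-two common part unless the loadings on sk are proportional to those on p1. That proportionality is the constraint under which the higher-order model can be rewritten in hierarchical (bifactor) form, as discussed by Yung, Thissen, and McLeod (1999).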

Regression modeling

As mentioned, the interpretational advantage of the second order factor model comes with a limitation with respect to regression modeling of a predictor x and dependent variable y. Ensuring identification of the regression model, in which we regress an external dependent variable y on the factors or in which we regress the common factors on an external predictor x (see Fig. 2), requires us to fix one of the K + 1 regression coefficients. Furthermore, the y variance decomposition in the second order factor model is considered hard to interpret, because the decomposition is not orthogonal due to the correlation between the factors fk and p2. The latter limitation is an issue only if the cause of the correlation between the two predictors is unknown. In that case, the explained variance includes a component that cannot be attributed unambiguously to either predictor, because it depends in part on this correlation. However, in the second order factor model, the cause of the correlation between the predictors fk and p2 is represented explicitly by the parameter a in the regression of fk on p2 (Fig. 2), and therefore the variance decomposition does not pose any problem of interpretation. Specifically, in terms of Fig. 2, the explained y variance is due to the direct effect of p2 on y (path byp), the indirect effect of p2 on y (path a*byf), and the first order factor residual ζf (path gfζ*byf). Note that this model allows us to investigate explicitly the role of the p factor (p2) as the mediator of the relationship between fk and the dependent y (i.e. the test of byf = 0). In the bifactor model, the dependent variable variance is explained by the paths cys and cyp (Fig. 1). This model precludes a statement concerning the contribution of fk to the explained variance, and cannot address the issue of mediation, where the prediction of y by fk may be mediated by p2 (i.e. fk ← p2 → y).
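The decomposition of the explained y variance in the second order factor model can be written out as a short calculation. The sketch below assumes a single first order factor fk, as in Fig. 2, with structural equations fk = a*p2 + gfζ*ζf and y = byp*p2 + byf*fk + ζy, and with p2, ζf and ζy mutually uncorrelated. The unit variances and numerical values are hypothetical. The first function expresses the explained variance in terms of the uncorrelated sources p2 and ζf; the second uses the correlated predictors p2 and fk, including the covariance term, and arrives at the same total.

```python
import math

def explained_y_variance(a, g_fzeta, b_yp, b_yf, var_p2=1.0, var_zeta_f=1.0):
    """Explained Var(y) in terms of the uncorrelated sources p2 and zeta_f."""
    # Direct effect (b_yp) plus indirect effect (a * b_yf) of p2 on y.
    via_p2 = (b_yp + a * b_yf) ** 2 * var_p2
    # Contribution of the first order factor residual zeta_f.
    via_zeta_f = (g_fzeta * b_yf) ** 2 * var_zeta_f
    return {"p2": via_p2, "zeta_f": via_zeta_f, "total": via_p2 + via_zeta_f}

def explained_y_variance_correlated(a, g_fzeta, b_yp, b_yf,
                                    var_p2=1.0, var_zeta_f=1.0):
    """Same quantity, written with the correlated predictors p2 and f_k."""
    var_f = a ** 2 * var_p2 + g_fzeta ** 2 * var_zeta_f
    cov_p2_f = a * var_p2  # the correlation of f_k and p2 stems from a
    return b_yp ** 2 * var_p2 + b_yf ** 2 * var_f + 2 * b_yp * b_yf * cov_p2_f

# Hypothetical parameter values, chosen only for illustration.
parts = explained_y_variance(a=0.6, g_fzeta=0.8, b_yp=0.3, b_yf=0.5)
total = explained_y_variance_correlated(a=0.6, g_fzeta=0.8, b_yp=0.3, b_yf=0.5)
print(parts, total)
assert math.isclose(parts["total"], total)
```

Because the covariance between fk and p2 is fully determined by the parameter a, the component that looks ambiguous in the second form is attributed to p2 in the first.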

The same reasoning applies with respect to the predictor x (Fig. 2). Again, the correlation between fk and p2 stemming from the parameter a does not hinder the interpretation of components of explained variance. The explained variance of p2 is due to the path gpx. The explained variance in fk is decomposed into two interpretable parts: one due to the path gpx*a, i.e. the part involving p2 as a mediator, and one due to gfx. From the point of view of the standard mediation model (e.g. Baron & Kenny, 1986; Maxwell & Cole, 2007), demonstrating that the relation between the predictor x and the factors fk is fully mediated by p2 (i.e. the test of gfx = 0) would likely advance our understanding of the p factor. For instance, genetic pleiotropy is well established in studies of psychopathology (see Grotzinger et al., 2022; Mallard et al., 2022). It would be of interest to determine the extent to which the genetic correlations among the factors fk are attributable to genetic effects that are mediated by p2. In the bifactor model, the mediation hypothesis, as outlined above in the second order factor model, does not apply, because it does not include the common factors fk. Similarly, mediation among the predictor x, the factor p1 and the specific factors sk does not apply, because the factor p1 and the specific factors sk are uncorrelated. One can regress p1 and sk on the external predictor x, and view the non-significance of the latter regression (sk on x, i.e. parameter tsx = 0 in Fig. 1) as evidence of mediation by p1 of the relation between the predictor x and the items. However, this is a different mediation hypothesis, which involves the items, not the common factors fk.
With respect to the external predictor x, we note that the bifactor model appears to be restrictive, as it implicitly rules out the possibility that the external predictor x is associated with both sk and p1. Specifically, a predictor common to sk and p1 (i.e. tsx ≠ 0 and tpx ≠ 0 in Fig. 1) would give rise to a correlation between sk and p1 (due to tsx*tpx). This would seem to violate the assumption in the bifactor model that sk and p1 are uncorrelated.
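The correlation induced by a common predictor can be illustrated with a brief simulation. The sketch below assumes sk = tsx*x + ζsk and p1 = tpx*x + ζp, with x, ζsk and ζp mutually uncorrelated and, for simplicity, standard normal. These distributional choices, the parameter values and the function names are ours. Under these assumptions the covariance between sk and p1 equals tsx*tpx*Var(x).

```python
import numpy as np

def induced_covariance(t_sx, t_px, var_x=1.0):
    """Covariance between s_k and p1 implied by a common predictor x."""
    return t_sx * t_px * var_x

def simulated_covariance(t_sx, t_px, n=200_000, seed=0):
    """Monte Carlo check with standard normal x and residuals (an assumption)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    zeta_s = rng.standard_normal(n)
    zeta_p = rng.standard_normal(n)
    s_k = t_sx * x + zeta_s  # regression of s_k on x
    p_1 = t_px * x + zeta_p  # regression of p1 on x
    return float(np.cov(s_k, p_1)[0, 1])

# Hypothetical regression coefficients, chosen only for illustration.
analytic = induced_covariance(t_sx=0.3, t_px=0.4)
simulated = simulated_covariance(t_sx=0.3, t_px=0.4)
print(analytic, simulated)
```

Setting either coefficient to zero removes the induced covariance, which is the situation that the assumption of uncorrelated sk and p1 implicitly requires.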

Conclusion

In conclusion, the bifactor model poses problems of interpretation, because (1) it implies bi-dimensionality of the indicator set, which does not sit well with the psychometric and substantive ideal of unidimensionality; (2) it operationalizes the p factor as a direct source of correlation among items, while its explanandum is arguably the correlations among the factors fk; (3) the common factors sk are hard to interpret, and cannot be interpreted as factors fk; (4) it does not allow for tests of the mediatory role of the p factor vis-à-vis the common factors fk in the modeling of predictors (x) or dependent variables (y).

We emphasize that our commentary is based on the realistic interpretation of the first order factors (fk) in the second order factor model (Borsboom et al., 2003). This has considerable explanatory import: the model encodes the scientific hypothesis that a general liability to develop psychopathology makes one more liable to any specific form of psychopathology, and therefore explains why the first order factors (fk) are correlated. While this is a strong hypothesis that may very well be incorrect, it does represent a theory worthy of further investigation; a theory, moreover, that is consistent with what we would take to be the natural reading of the p-factor as a general liability. As such, it would be infelicitous if the bifactor model, chosen on the basis of pragmatic concerns regarding prediction and arguably inconsequential differences in model fit, were to be adopted as the default model without further deliberation. We therefore think that the choice between the bifactor model and the second order model deserves more attention than it currently enjoys.

References

Achenbach, T. M. (2021). Hierarchical dimensional models of psychopathology: Yes, but…. World Psychiatry, 20(1), 64–65. https://doi.org/10.1002/wps.20810.
Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51(6), 1173–1182. doi:10.1037/0022-3514.51.6.1173.
Borsboom, D., Deserno, M. K., Rhemtulla, M., Epskamp, S., Fried, E. I., McNally, R. J., … Waldorp, L. J. (2021). Network analysis of multivariate data in psychological science. Nature Reviews Methods Primers, 1(1), 58. https://doi.org/10.1038/s43586-021-00055-w.
Borsboom, D., Mellenbergh, G., & van Heerden, J. (2003). The theoretical status of latent variables. Psychological Review, 110(2), 203–219. doi:10.1037/0033-295X.110.2.203.
Caspi, A., Houts, R. M., Belsky, D. W., Goldman-Mellor, S. J., Harrington, H., Israel, S., … Moffitt, T. E. (2014). The p factor: One general psychopathology factor in the structure of psychiatric disorders? Clinical Psychological Science, 2(2), 119–137. doi:10.1177/2167702613497473.
Chen, F. F., West, S. G., & Sousa, K. H. (2006). A comparison of bifactor and second-order models of quality of life. Multivariate Behavioral Research, 41(2), 189–225.
Greene, A. L., Eaton, N. R., Li, K., Forbes, M. K., Krueger, R. F., Markon, K. E., Waldman, I. D., … Kotov, R. (2019). Are fit indices used to test psychopathology structure biased? A simulation study. Journal of Abnormal Psychology, 128(7), 740–764. doi:10.1037/abn0000434.
Grotzinger, A. D., Mallard, T. T., Akingbuwa, W. A., Ip, H. F., Adams, M. J., Lewis, C. M., McIntosh, A., … Nivard, M. G. (2022). Genetic architecture of 11 major psychiatric disorders at biobehavioral, functional genomic and molecular genetic levels of analysis. Nature Genetics, 54(5), 548–559. https://doi.org/10.1038/s41588-022-01057-4.
Lahey, B. B., Applegate, B., Hakes, J. K., Zald, D. H., Hariri, A. R., & Rathouz, P. J. (2012). Is there a general factor of prevalent psychopathology during adulthood? Journal of Abnormal Psychology, 121(4), 971–977. doi:10.1037/a0028355.
Lahey, B. B., Moore, T. M., Kaczkurki, A. N., & Zald, D. H. (2021). Hierarchical models of psychopathology: Empirical support, implications, and remaining issues. World Psychiatry, 20(1), 57–63. doi:10.1002/wps.20824.
Mallard, T. T., Karlsson Linnér, R., Grotzinger, A. D., Sanchez-Roige, S., Seidlitz, J., Okbay, A., … Paige Harden, K. (2022). Multivariate GWAS of psychiatric disorders and their cardinal symptoms reveal two dimensions of cross-cutting genetic liabilities. Cell Genomics, 2(6), 100140. doi:10.1016/j.xgen.2022.100140.
Markon, K. E. (2019). Bifactor and hierarchical models: Specification, inference, and interpretation. Annual Review of Clinical Psychology, 15(1), 51–69. https://doi.org/10.1146/annurev-clinpsy-050718-095522.
Maxwell, S. E., & Cole, D. A. (2007). Bias in cross-sectional analyses of longitudinal mediation. Psychological Methods, 12(1), 23–44. doi:10.1037/1082-989X.12.1.23.
Pettersson, E., Larsson, H., & Lichtenstein, P. (2021). Psychometrics, interpretation and clinical implications of hierarchical models of psychopathology. World Psychiatry, 20(1), 68–69. doi:10.1002/wps.20813.
Southward, M. W., Cheavens, J. S., & Coccaro, E. F. (2022). Defining the p-factor: An empirical test of five leading theories. Psychological Medicine, 1–12. doi:10.1017/S0033291722001635. PMID: 35711145.
van Bork, R., Epskamp, S., Rhemtulla, M., Borsboom, D., & van der Maas, H. L. J. (2017). What is the p-factor of psychopathology? Some risks of general factor modeling. Theory & Psychology, 27(6), 759–773. https://doi.org/10.1177/0959354317737185.
Yung, Y. F., Thissen, D., & McLeod, L. D. (1999). On the relationship between the higher-order factor model and the hierarchical factor model. Psychometrika, 64, 113–128. doi:10.1007/BF02294531.