TESTING FOR HOMOGENEITY IN MIXTURE MODELS

Jiaying Gu; Roger Koenker; Stanislav Volgushev

doi:10.1017/S0266466617000299

Published online by Cambridge University Press: 24 July 2017

Jiaying Gu*: University of Toronto
Roger Koenker: University of Illinois
Stanislav Volgushev: University of Toronto
*Address correspondence to Jiaying Gu, Department of Economics, University of Toronto, Toronto, Canada; e-mail: jiaying.gu@utoronto.ca.

Abstract

Statistical models of unobserved heterogeneity are typically formalized as mixtures of simple parametric models, and interest naturally focuses on testing for homogeneity versus general mixture alternatives. Many tests of this type can be interpreted as C(α) tests, as in Neyman (1959), and shown to be locally asymptotically optimal. These C(α) tests will be contrasted with a new approach to likelihood ratio testing for general mixture models. The latter tests are based on estimation of a general nonparametric mixing distribution with the Kiefer and Wolfowitz (1956) maximum likelihood estimator. Recent developments in convex optimization have dramatically improved upon earlier EM methods for computation of these estimators, and recent results on the large sample behavior of likelihood ratios involving such estimators yield a tractable form of asymptotic inference. Improvement in computational efficiency also facilitates the use of a bootstrap method to determine critical values that are shown to work better than the asymptotic critical values in finite samples. Consistency of the bootstrap procedure is also formally established. We compare the performance of the two approaches, identifying circumstances in which each is preferred.
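
As a rough illustration of the likelihood ratio approach described in the abstract, the following sketch approximates a Kiefer and Wolfowitz type nonparametric maximum likelihood estimate of the mixing distribution on a fixed grid and forms a likelihood ratio statistic against homogeneity. It is not the authors' implementation. It assumes a Gaussian location mixture with unit variance, uses a simple EM fixed-point iteration as a slow stand-in for the convex optimization methods the abstract refers to, and the function names, grid size, and simulated data are all hypothetical. No critical values are computed; according to the abstract these would come from the asymptotic theory or from a bootstrap.

```python
import numpy as np

def npmle_grid_em(x, grid, n_iter=500):
    """Approximate NPMLE of mixing weights on a fixed grid (illustrative).

    Assumes x_i ~ N(theta_i, 1) with theta_i drawn from an unknown mixing
    distribution supported on `grid`. Uses EM, not convex optimization.
    """
    # A[i, j] = density of N(grid[j], 1) evaluated at x[i]
    A = np.exp(-0.5 * (x[:, None] - grid[None, :]) ** 2) / np.sqrt(2 * np.pi)
    w = np.full(grid.size, 1.0 / grid.size)  # start from uniform weights
    for _ in range(n_iter):
        mix = A @ w                              # mixture density at each x[i]
        w = w * (A / mix[:, None]).mean(axis=0)  # EM update, keeps sum(w) == 1
    loglik = np.log(A @ w).sum()
    return w, loglik

def homogeneous_loglik(x):
    """Log-likelihood under homogeneity: x_i ~ N(mu, 1), mu set to the sample mean."""
    mu = x.mean()
    return (-0.5 * (x - mu) ** 2 - 0.5 * np.log(2 * np.pi)).sum()

# Simulated heterogeneous data (assumption): equal mix of N(-2, 1) and N(2, 1)
rng = np.random.default_rng(0)
n = 200
labels = rng.integers(0, 2, size=n)
x = rng.normal(loc=np.where(labels == 0, -2.0, 2.0), scale=1.0)

grid = np.linspace(x.min(), x.max(), 100)
w, ll_mix = npmle_grid_em(x, grid)
lr_stat = 2.0 * (ll_mix - homogeneous_loglik(x))
print(f"LR statistic: {lr_stat:.2f}")
```

The grid restriction and the fixed number of EM iterations mean the fitted log-likelihood is only an approximation to the unrestricted maximum, so the statistic printed here should be read as illustrative.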

Type: ARTICLES
Information: Econometric Theory, Volume 34, Issue 4, August 2018, pp. 850–895

DOI: https://doi.org/10.1017/S0266466617000299
Copyright: © Cambridge University Press 2017

Footnotes

This research was partially supported by NSF grant SES-11-53548 and Project C1 of the SFB 823 of the German Research Foundation. Part of this research was conducted while the first author was visiting the Mathematics department at Ruhr University Bochum and the third author was a visiting scholar at UIUC. They are very grateful to the UIUC Statistics and Economics departments and the Bochum Mathematics department for their hospitality. The third author also gratefully acknowledges financial support from the DFG (grant VO1799/1-1). The authors would also like to express their appreciation to the Editor, the Co-Editor and the referees for comments that led to improvements in the article.

References

Andersen, E.D. (2010) The MOSEK Optimization Tools Manual, Version 6.0. Available at http://www.mosek.com.

Azaïs, J.-M., Gassiat, É., & Mercadier, C. (2006) Asymptotic distribution and local power of the log-likelihood ratio test for mixtures: Bounded and unbounded cases. Bernoulli 12(5), 775–799.

Azaïs, J.-M., Gassiat, É., & Mercadier, C. (2009) The likelihood ratio test for general mixture models with or without structural parameter. ESAIM: Probability and Statistics 13, 301–327.

Bickel, P. & Chernoff, H. (1993) Asymptotic distribution of the likelihood ratio statistic in a prototypical nonregular problem. In Ghosh, J., Mitra, S., Parthasarathy, K., & PrakasaRao, B. (eds.), Statistics and Probability: A Raghu Raj Bahadur Festschrift, pp. 83–96. Wiley.

Böhning, D., Schlattmann, P., & Lindsay, B. (1992) Computer-assisted analysis of mixtures (C.A.MAM): Statistical algorithms. Biometrics 48, 283–303.

Bücher, A., Dette, H., & Volgushev, S. (2011) New estimators of the Pickands dependence function and a test for extreme-value dependence. The Annals of Statistics 39(4), 1963–2006.

Bühler, W. & Puri, P. (1966) On optimal asymptotic tests of composite hypotheses with several constraints. Probability Theory and Related Fields 5, 71–88.

Chen, H. & Chen, J. (2001) Large sample distribution of the likelihood ratio test for normal mixtures. Canadian Journal of Statistics 29, 201–216.

Chen, H., Chen, J., & Kalbfleisch, J. (2001) A modified likelihood ratio test for homogeneity in finite mixture models. Journal of the Royal Statistical Society: B 63, 19–29.

Chen, J. (1995) Optimal rate of convergence for finite mixture models. The Annals of Statistics 23, 221–233.

Chen, J. & Li, P. (2009) Hypothesis test for normal mixture models. The Annals of Statistics 37, 2523–2542.

Chen, J., Li, P., & Liu, Y. (2016) Sample-size calculation for tests of homogeneity. Canadian Journal of Statistics 44, 82–101.

Chen, X., Ponomareva, M., & Tamer, E. (2014) Likelihood inference in some finite mixture models. Journal of Econometrics 182, 87–99.

Chesher, A. (1984) Testing for neglected heterogeneity. Econometrica 52(4), 865–872.

Cho, J. & White, H. (2007) Testing for regime switching. Econometrica 75, 1671–1720.

Dicker, L. & Zhao, S.D. (2016) High-dimensional classification via nonparametric empirical Bayes and maximum likelihood. Biometrika 103, 21–34.

Efron, B. (2011) Tweedie’s formula and selection bias. Journal of the American Statistical Association 106, 1602–1614.

Friberg, H.A. (2012) Rmosek: The R-to-MOSEK Optimization Interface, R package version 1.2.3.

Gassiat, E. (2002) Likelihood ratio inequalities with applications to various mixtures. In Annales de l’Institut Henri Poincare (B) Probability and Statistics, vol. 38, pp. 897–906. Elsevier.

Giacomini, R., Politis, D., & White, H. (2013) A warp-speed method for conducting Monte Carlo experiments involving bootstrap estimators. Econometric Theory 29(3), 567–589.

Groeneboom, P., Jongbloed, G., & Wellner, J.A. (2008) The support reduction algorithm for computing non-parametric function estimates in mixture models. Scandinavian Journal of Statistics 35, 385–399.

Gu, J. (2016) Neyman’s C(α) test for unobserved heterogeneity. Econometric Theory 32(6), 1483–1522.

Hall, P. & Stewart, M. (2005) Theoretical analysis of power in a two-component normal mixture model. Journal of Statistical Planning and Inference 134, 158–179.

Hartigan, J. (1985) A failure of likelihood asymptotics for normal mixtures. In LeCam, L. & Olshen, R. (eds.), Proceedings of the Berkeley Conference in Honor of Jerzy Neyman and Jack Kiefer, pp. 807–810. Wadsworth.

Heckman, J. & Singer, B. (1984) A method for minimizing the impact of distributional assumptions in econometric models for duration data. Econometrica 52, 63–132.

Jiang, W. & Zhang, C.-H. (2009) General maximum likelihood empirical Bayes estimation of normal means. The Annals of Statistics 37, 1647–1684.

Kasahara, H. & Shimotsu, K. (2015) Testing the number of components in normal mixture regression models. Journal of the American Statistical Association 110, 1632–1645.

Kiefer, J. & Wolfowitz, J. (1956) Consistency of the maximum likelihood estimator in the presence of infinitely many incidental parameters. The Annals of Mathematical Statistics 27, 887–906.

Koenker, R. (2013) REBayes: Empirical Bayes Estimation and Inference in R, R package version 0.41.

Koenker, R. & Mizera, I. (2014) Convex optimization, shape constraints, compound decisions and empirical Bayes rules. Journal of the American Statistical Association 109(506), 674–685.

Laird, N. (1978) Nonparametric maximum likelihood estimation of a mixing distribution. Journal of the American Statistical Association 73, 805–811.

Ledoux, M. & Talagrand, M. (1991) Probability in Banach Spaces: Isoperimetry and Processes, vol. 23. Springer Science & Business Media.

Lesperance, M.L. & Kalbfleisch, J.D. (1992) An algorithm for computing the nonparametric MLE of a mixing distribution. Journal of the American Statistical Association 87, 120–126.

Li, P. & Chen, J. (2010) Testing the order of a finite mixture. Journal of the American Statistical Association 105, 1084–1092.

Li, P., Chen, J., & Marriott, P. (2009) Non-finite Fisher information and homogeneity: The EM approach. Biometrika 96, 411–426.

Lindsay, B. (1981) Properties of the maximum likelihood estimator of a mixing distribution. In Patil, G. (ed.), Statistical Distributions in Scientific Work, vol. 5, pp. 95–109. Reidel.

Lindsay, B. (1983) The geometry of mixture likelihoods: A general theory. The Annals of Statistics 11, 86–94.

Lindsay, B. (1995) Mixture Models: Theory, Geometry and Applications. NSF-CBMS-IMS Conference Series in Statistics, Institute of Mathematical Statistics and American Statistical Association.

Liu, X. & Shao, Y. (2003) Asymptotics for likelihood ratio tests under loss of identifiability. The Annals of Statistics 31(3), 807–832.

McLachlan, G. (1987) On bootstrapping likelihood ratio test statistics for the number of components in a normal mixture. Journal of the Royal Statistical Society, Series C 36, 318–324.

Moran, P. (1973) Asymptotic properties of homogeneity tests. Biometrika 60(1), 79–85.

Neyman, J. (1959) Optimal asymptotic tests of composite statistical hypotheses. In Grenander, U. (ed.), Probability and Statistics, the Harald Cramer Volume, pp. 213–234. Wiley.

Robbins, H. (1950) A generalization of the method of maximum likelihood: Estimating a mixing distribution (abstract). The Annals of Mathematical Statistics 21, 314.

Saunders, M.A. (2003) PDCO: A Primal-Dual interior solver for convex optimization. Available at: http://www.stanford.edu/group/SOL/software/pdco.html.

Tsirel’son, V. (1976) The density of the distribution of the maximum of a Gaussian process. Theory of Probability & Its Applications 20(4), 847–856.

van der Vaart, A.W. (1998) Asymptotic Statistics, vol. 3. Cambridge University Press.

van der Vaart, A.W. & Wellner, J.A. (1996) Weak Convergence and Empirical Processes - Springer Series in Statistics. Springer.
