Graphical Models for Categorical Data

Alberto Roverato

doi:10.1017/9781108277495

Series: SemStat Elements

Published online by Cambridge University Press: 16 June 2017

Affiliation: Università di Bologna

Summary

For advanced students of network data science, this compact account covers both well-established methodology and the theory of models recently introduced in the graphical model literature. It focuses on the discrete case where all variables involved are categorical and, in this context, it achieves a unified presentation of classical and recent results.

Type: Element

Online ISBN: 9781108277495

Publisher: Cambridge University Press

Print publication: 24 August 2017

