Search results for Statistical theory and methods

1 - Introduction
A. Colin Cameron, University of California, Davis, Pravin K. Trivedi, Indiana University
Book:

Regression Analysis of Count Data

Published online:

05 January 2013

Print publication:

28 September 1998, pp 1-18
- Chapter
Summary

God made the integers, all the rest is the work of man.
Kronecker
This book is concerned with models of event counts. An event count refers to the number of times an event occurs, for example, the number of airline accidents or earthquakes. An event count is the realization of a nonnegative integer-valued random variable. A univariate statistical model of event counts usually specifies a probability distribution of the number of occurrences of the event known up to some parameters. Estimation and inference in such models are concerned with the unknown parameters, given the probability distribution and the count data. Such a specification involves no other variables and the number of events is assumed to be independently and identically distributed (iid). Much early theoretical and applied work on event counts was carried out in the univariate framework. The main focus of this book, however, is regression analysis of event counts.
The statistical analysis of counts within the framework of discrete parametric distributions for univariate iid random variables has a long and rich history (Johnson, Kotz, and Kemp, 1992). The Poisson distribution was derived as a limiting case of the binomial by Poisson (1837). Early applications include the classic study of Bortkiewicz (1898) of the annual number of deaths from being kicked by mules in the Prussian army. A standard generalization of the Poisson is the negative binomial distribution. It was derived by Greenwood and Yule (1920), as a consequence of apparent contagion due to unobserved heterogeneity, and by Eggenberger and Polya (1923) as a result of true contagion.
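The limiting argument mentioned above can be checked numerically. The following is a minimal sketch, not taken from the book: it holds the mean mu = n * p fixed, lets n grow, and compares the binomial and Poisson probabilities. The function names, the value mu = 2.0, and the cutoff k_max = 20 are illustrative assumptions.

```python
import math

def binomial_pmf(k, n, p):
    """P(Y = k) for a binomial(n, p) count."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)

def poisson_pmf(k, mu):
    """P(Y = k) for a Poisson count with mean mu."""
    return math.exp(-mu) * mu**k / math.factorial(k)

def max_pmf_gap(n, mu, k_max=20):
    """Largest pointwise gap between binomial(n, mu/n) and Poisson(mu) on k = 0..k_max."""
    p = mu / n
    return max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, mu))
               for k in range(k_max + 1))

mu = 2.0  # illustrative mean, not a value from the book
gaps = {n: max_pmf_gap(n, mu) for n in (25, 100, 1000, 10000)}
for n, gap in gaps.items():
    print(f"n = {n:>5}: max |binomial - Poisson| = {gap:.2e}")
```

The gap should shrink roughly in proportion to 1/n, which is the sense in which the Poisson arises as a limit of the binomial with many trials and a small success probability.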

Frontmatter
A. Colin Cameron, University of California, Davis, Pravin K. Trivedi, Indiana University
Book:

Regression Analysis of Count Data

Published online:

05 January 2013

Print publication:

28 September 1998, pp i-viii
- Chapter

2 - Model Specification and Estimation
A. Colin Cameron, University of California, Davis, Pravin K. Trivedi, Indiana University
Book:

Regression Analysis of Count Data

Published online:

05 January 2013

Print publication:

28 September 1998, pp 19-58
- Chapter
Summary

Introduction
The general modeling approaches most often used in count data analysis – likelihood-based, generalized linear models, and moment-based – are presented in this chapter. Statistical inference for these nonlinear regression models is based on asymptotic theory, which is also summarized.
The models and results vary according to the strength of the distributional assumptions made. Likelihood-based models and the associated maximum likelihood estimator require complete specification of the distribution. Statistical inference is usually performed under the assumption that the distribution is correctly specified.
A less parametric analysis assumes that some aspects of the distribution of the dependent variable are correctly specified while others are not specified, or if specified are potentially misspecified. For count data models considerable emphasis has been placed on analysis based on the assumption of correct specification of the conditional mean, or on the assumption of correct specification of both the conditional mean and the conditional variance. This is a nonlinear generalization of the linear regression model, where consistency requires correct specification of the mean and efficient estimation requires correct specification of the mean and variance. It is a special case of the class of generalized linear models, widely used in the statistics literature. Estimators for generalized linear models coincide with maximum likelihood estimators if the specified density is in the linear exponential family. But even then the analytical distribution of the same estimator can differ across the two approaches if different second moment assumptions are made.
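The last point can be illustrated with a small simulation. This is a hedged sketch under stated assumptions, not the book's code: counts are generated with a correctly specified exponential conditional mean but with extra gamma heterogeneity, so the Poisson variance assumption fails. The Poisson estimator is computed by Newton iterations, and its standard error is reported twice: once under the Poisson variance assumption and once using a sandwich form that relies only on the mean. All names, the seed, and the parameter values are illustrative.

```python
import math
import random

def draw_poisson(rng, lam):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def invert_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def moment_sums(b0, b1, x, y):
    """Score g, A = sum(mu x x'), B = sum((y - mu)^2 x x') for mean exp(b0 + b1 x)."""
    g = [0.0, 0.0]
    A = [[0.0, 0.0], [0.0, 0.0]]
    B = [[0.0, 0.0], [0.0, 0.0]]
    for xi, yi in zip(x, y):
        mu = math.exp(b0 + b1 * xi)
        r = yi - mu
        row = (1.0, xi)
        for i in range(2):
            g[i] += r * row[i]
            for j in range(2):
                A[i][j] += mu * row[i] * row[j]
                B[i][j] += r * r * row[i] * row[j]
    return g, A, B

def fit_poisson_qml(x, y, iterations=25):
    """Newton iterations on the Poisson score; returns estimates and two SEs for b1."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(iterations):
        g, A, _ = moment_sums(b0, b1, x, y)
        inv = invert_2x2(A)
        b0 += inv[0][0] * g[0] + inv[0][1] * g[1]
        b1 += inv[1][0] * g[0] + inv[1][1] * g[1]
    _, A, B = moment_sums(b0, b1, x, y)
    inv = invert_2x2(A)
    # Slope element of the sandwich A^{-1} B A^{-1}.
    robust_var = sum(inv[1][j] * B[j][k] * inv[k][1]
                     for j in range(2) for k in range(2))
    return b0, b1, math.sqrt(inv[1][1]), math.sqrt(robust_var)

rng = random.Random(2024)                      # arbitrary seed
true_b0, true_b1, dispersion = 0.5, 0.8, 1.0   # illustrative values
x = [rng.random() for _ in range(5000)]
# Gamma heterogeneity with mean 1 keeps the conditional mean equal to exp(b0 + b1 x).
y = [draw_poisson(rng, math.exp(true_b0 + true_b1 * xi)
                  * rng.gammavariate(1.0 / dispersion, dispersion)) for xi in x]
b0_hat, b1_hat, se_ml, se_robust = fit_poisson_qml(x, y)
print(f"b1_hat = {b1_hat:.3f}, Poisson-ML SE = {se_ml:.3f}, sandwich SE = {se_robust:.3f}")
```

In this setup the slope estimate should land near its true value even though the data are not Poisson, while the two standard errors differ noticeably: the same estimator, with different second-moment assumptions.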

11 - Nonrandom Samples and Simultaneity
A. Colin Cameron, University of California, Davis, Pravin K. Trivedi, Indiana University
Book:

Regression Analysis of Count Data

Published online:

05 January 2013

Print publication:

28 September 1998, pp 326-343
- Chapter
Summary

Introduction
This chapter deals with the topic of valid inference about the population given samples that are not simple random samples. There are several well-known ways in which departures from simple random sampling occur. They include choice-based sampling and endogenous stratified sampling, endogenous regressors, and sample selection.
The departure from simple random sampling may cause the sample probability of observations to differ from the corresponding population probabilities. In general such a divergence leads to models in which simple conditioning on exogenous variables does not lead to consistent estimates of the population parameters. These topics have been studied in depth in the discrete choice literature (Manski and McFadden, 1981). The analysis of count data in the presence of such complications is relatively underexplored.
A second topic considered in this chapter is endogenous regressors. Ignoring the feedback from the response variable to the endogenous regressor leads in general to invalid inferences. The estimation procedure should allow for stochastic dependence between the response variable and endogenous regressors. In considering this issue the existing literature on simultaneous equation estimation in nonlinear models is of direct relevance (Amemiya, 1985). This material is a continuation of section 8.2.
The third topic considered is sample selection in count regression, which also is closely related to issues of simultaneity and nonrandom sampling.

5 - Model Evaluation and Testing
A. Colin Cameron, University of California, Davis, Pravin K. Trivedi, Indiana University
Book:

Regression Analysis of Count Data

Published online:

05 January 2013

Print publication:

28 September 1998, pp 139-188
- Chapter
Summary

Introduction
It is desirable to analyze count data using a cycle of model specification, estimation, testing, and evaluation. This cycle can go from specific to general models – for example, it can begin with Poisson and then test for negative binomial – or one can use a general to specific approach – for example, begin with negative binomial and then test the restrictions imposed by Poisson. For inclusion of regressors in a given count model either approach might be taken; for choice of the count data model itself other than simple choices such as Poisson or negative binomial the former approach is most often useful. For example, if the negative binomial model is inadequate, there is a very wide range of models that might be considered, rendering a general-to-specific approach difficult to implement.
The preceding two chapters have presented the specification and estimation components of this cycle for cross-section count data. In this chapter we focus on the testing and evaluation aspects of this cycle. This includes residual analysis, goodness-of-fit measures, and moment-based specification tests, in addition to classical statistical inference.
Residual analysis, based on a range of definitions of the residual for heteroskedastic data such as counts, is presented in section 5.2. A range of measures of goodness of fit, including pseudo R-squareds and a chi-square goodness-of-fit statistic, are presented in section 5.3. Likelihood-based hypothesis tests for overdispersion, introduced in section 3.4, are discussed more extensively in section 5.4. Small-sample corrections, including the bootstrap pairs procedure for quite general cross-section data models, are presented in section 5.5.
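As a rough illustration of what an overdispersion check looks for, here is a minimal sketch. It does not implement the likelihood-based tests of section 5.4; it swaps in the simpler classical index-of-dispersion statistic for iid counts without regressors, with a normal approximation to its chi-square reference distribution. The function names, seed, sample size, and parameter values are assumptions made for the example.

```python
import math
import random

def draw_poisson(rng, lam):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def dispersion_z(counts):
    """Standardized index of dispersion.

    Under iid Poisson sampling (n - 1) * s^2 / mean is approximately
    chi-square with n - 1 degrees of freedom; the return value centers and
    scales it so that it is roughly standard normal for large n.
    """
    n = len(counts)
    mean = sum(counts) / n
    s2 = sum((c - mean) ** 2 for c in counts) / (n - 1)
    index = (n - 1) * s2 / mean
    return (index - (n - 1)) / math.sqrt(2.0 * (n - 1))

rng = random.Random(12345)        # arbitrary seed
n, mu, alpha = 2000, 3.0, 0.5     # illustrative values
poisson_counts = [draw_poisson(rng, mu) for _ in range(n)]
# Gamma heterogeneity with mean 1 and variance alpha gives variance mu + alpha * mu^2.
mixed_counts = [draw_poisson(rng, mu * rng.gammavariate(1.0 / alpha, alpha))
                for _ in range(n)]

z_poisson = dispersion_z(poisson_counts)
z_mixed = dispersion_z(mixed_counts)
print(f"Poisson sample: z = {z_poisson:.2f}")
print(f"Gamma-mixed sample: z = {z_mixed:.2f}")
```

A large positive value points toward a variance exceeding the mean, the situation in which a move from Poisson to negative binomial would be considered.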

6 - Empirical Illustrations
A. Colin Cameron, University of California, Davis, Pravin K. Trivedi, Indiana University
Book:

Regression Analysis of Count Data

Published online:

05 January 2013

Print publication:

28 September 1998, pp 189-220
- Chapter
Summary

Introduction
In this chapter we provide a detailed discussion of empirical models based on two cross-sectional data sets. The first of these analyzes the demand for medical care by the elderly in the United States. This data set shares many features of health utilization studies based on cross-section data. The second is an analysis of recreational trips.
Section 6.2 extends the introduction by surveying two general modeling issues. The first is the decision to model only the conditional mean versus the full distribution of counts. The second issue concerns behavioral interpretation of count models, an issue of importance to econometricians who emphasize the distinction between reduced form and structural models. Sections 6.3 and 6.4 deal in turn with each of the two empirical applications. Each has several subsections that deal with details. The health care example in section 6.3 is intended to illustrate in detail the methodology for fitting a finite mixture model. There are relatively few econometric examples that discuss at length the implementation of the finite mixture model and the interpretation of the results. The example is intended to fill this gap. Section 6.5 pursues a methodological question concerning the distribution of the LR test under nonstandard conditions, previously raised in section 4.8.5. The final two sections provide concluding remarks and bibliographic notes. The emphasis of this chapter is on practical aspects of modeling. Each application involves several competing models which are compared and evaluated using model diagnostics and goodness-of-fit measures.

Appendices
A. Colin Cameron, University of California, Davis, Pravin K. Trivedi, Indiana University
Book:

Regression Analysis of Count Data

Published online:

05 January 2013

Print publication:

28 September 1998, pp -
- Chapter

Titles in the series
A. Colin Cameron, University of California, Davis, Pravin K. Trivedi, Indiana University
Book:

Regression Analysis of Count Data

Published online:

05 January 2013

Print publication:

28 September 1998, pp 412-412
- Chapter

B - Functions, Distributions, and Moments
A. Colin Cameron, University of California, Davis, Pravin K. Trivedi, Indiana University
Book:

Regression Analysis of Count Data

Published online:

05 January 2013

Print publication:

28 September 1998, pp 374-377
- Chapter

9 - Longitudinal Data
A. Colin Cameron, University of California, Davis, Pravin K. Trivedi, Indiana University
Book:

Regression Analysis of Count Data

Published online:

05 January 2013

Print publication:

28 September 1998, pp 275-300
- Chapter
Summary

Introduction
Longitudinal data or panel data are observations on a cross-section of individual units such as persons, households, firms, and regions that are observed over several time periods. The data structure is similar to that of multivariate data considered in Chapter 8. Analysis is simpler than for multivariate data because for each individual unit the same outcome variable is observed, rather than several different outcome variables. Analysis is more complex because this same outcome variable is observed at different points in time, introducing time series data considerations presented in Chapter 7.
In this chapter we consider longitudinal data analysis if the dependent variable is a count variable. Remarkably, many count regression applications are to longitudinal data rather than simpler cross-section data. Econometrics examples include the number of patents awarded to each of many individual firms over several years, the number of accidents in each of several regions, and the number of days of absence for each of many persons over several years. A political science example is the number of protests in each of several different countries over many years. A biological and health science example is the number of occurrences of a specific health event, such as seizure, for each of many patients in each of several time periods.
A key advantage of longitudinal data over cross-section data is that they permit more general types of individual heterogeneity. Excellent motivation was provided by Neyman (1965), who pointed out that panel data enable one to control for heterogeneity and thereby distinguish between true and apparent contagion.

A - Notation and Acronyms
A. Colin Cameron, University of California, Davis, Pravin K. Trivedi, Indiana University
Book:

Regression Analysis of Count Data

Published online:

05 January 2013

Print publication:

28 September 1998, pp 371-373
- Chapter
Summary

AIC: Akaike information criterion
ARMA: autoregressive moving average
BHHH: Berndt-Hall-Hall-Hausman algorithm
BIC: Bayes information criterion
BP: binary Poisson
Boot: bootstrap
CAIC: consistent Akaike information criterion
CB: correlated binomial
cdf: cumulative distribution function
CFMNB: slope-constrained finite mixture of negative binomials
CFMP: slope-constrained finite mixture of Poissons
CM: conditional moment (function or test)

C - Software
A. Colin Cameron, University of California, Davis, Pravin K. Trivedi, Indiana University
Book:

Regression Analysis of Count Data

Published online:

05 January 2013

Print publication:

28 September 1998, pp 378-378
- Chapter
Summary

Many widely used regression packages, including LIMDEP, STATA, TSP, and GAUSS, support maximum likelihood estimation of standard Poisson and negative binomial regressions, the latter of these in a separate count module. LIMDEP also supports the QGPML versions of the standard models, maximum-likelihood estimation of truncated or censored Poisson, geometric and negative binomial models, and ZIP and sample selection models. STATA also supports the generalized negative binomial regression in which the overdispersion parameter is further parameterized as a function of additional covariates. In addition, any statistical package with a generalized linear models component will include maximum likelihood and QGPML estimation of the Poisson, although not necessarily negative binomial. Thus, regression packages cover the models in Chapter 3 and roughly half of those in Chapter 4. The packages vary somewhat in the provision of diagnostics such as overdispersion tests and goodness-of-fit measures.
At the time of writing (late 1997) there is virtually no specialized software for the models presented in Chapters 7 through 12. A notable exception is estimation of basic panel count data models, which is provided by both LIMDEP and TSP. For models for which off-the-shelf software is not available, one needs to provide at least the likelihood function, for maximum likelihood estimation, or the moment conditions and weighting matrix, for GMM estimation. In principle this can be done using many regression packages, or using matrix programming languages such as GAUSS, MATLAB, S-PLUS, or SAS/IML. In practice numerical problems can be encountered if models are quite nonlinear.
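To make "provide at least the likelihood function" concrete, the following is a minimal sketch of a user-supplied log-likelihood in a general-purpose language. It assumes the common negative binomial parameterization with mean mu and variance mu + alpha * mu^2, an exponential mean with one regressor, and a log-transformed dispersion parameter; these choices and all names are assumptions for illustration, not the book's code. The function would then be handed to whatever numerical optimizer is available.

```python
import math

def nb2_log_pmf(y, mu, alpha):
    """Log probability of count y under a negative binomial with mean mu
    and variance mu + alpha * mu**2 (assumed parameterization)."""
    inv = 1.0 / alpha
    return (math.lgamma(y + inv) - math.lgamma(y + 1.0) - math.lgamma(inv)
            + inv * math.log(inv / (inv + mu))
            + y * math.log(mu / (inv + mu)))

def nb2_log_likelihood(params, x, y):
    """Sample log-likelihood for mean exp(b0 + b1 * x).

    params = (b0, b1, log_alpha); working with log(alpha) keeps the
    dispersion positive without constraining the optimizer.
    """
    b0, b1, log_alpha = params
    alpha = math.exp(log_alpha)
    return sum(nb2_log_pmf(yi, math.exp(b0 + b1 * xi), alpha)
               for xi, yi in zip(x, y))

# Sanity checks on the density itself, with illustrative values.
mu, alpha = 2.0, 0.5
probs = [math.exp(nb2_log_pmf(k, mu, alpha)) for k in range(400)]
total = sum(probs)
mean = sum(k * p for k, p in enumerate(probs))
variance = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
print(f"sum = {total:.6f}, mean = {mean:.6f}, variance = {variance:.6f}")
```

Checking that the probabilities sum to one and reproduce the intended mean and variance is a cheap guard against coding errors before any optimization is attempted, which matters because numerical problems are otherwise hard to tell apart from mistakes in the likelihood.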

2 - Asymptotic Efficiency of Nonparametric Goodness-of-Fit Tests
Yakov Nikitin, St Petersburg State University
Book:

Asymptotic Efficiency of Nonparametric Tests

Published online:

18 September 2009

Print publication:

30 June 1995, pp 40-93
- Chapter

3 - Asymptotic Efficiency of Nonparametric Homogeneity Tests
Yakov Nikitin, St Petersburg State University
Book:

Asymptotic Efficiency of Nonparametric Tests

Published online:

18 September 2009

Print publication:

30 June 1995, pp 94-126
- Chapter

Frontmatter
Yakov Nikitin, St Petersburg State University
Book:

Asymptotic Efficiency of Nonparametric Tests

Published online:

18 September 2009

Print publication:

30 June 1995, pp i-iv
- Chapter

Contents
Yakov Nikitin, St Petersburg State University
Book:

Asymptotic Efficiency of Nonparametric Tests

Published online:

18 September 2009

Print publication:

30 June 1995, pp v-viii
- Chapter

Bibliography
Yakov Nikitin, St Petersburg State University
Book:

Asymptotic Efficiency of Nonparametric Tests

Published online:

18 September 2009

Print publication:

30 June 1995, pp 249-264
- Chapter

Index
Yakov Nikitin, St Petersburg State University
Book:

Asymptotic Efficiency of Nonparametric Tests

Published online:

18 September 2009

Print publication:

30 June 1995, pp 265-274
- Chapter

4 - Asymptotic Efficiency of Nonparametric Symmetry Tests
Yakov Nikitin, St Petersburg State University
Book:

Asymptotic Efficiency of Nonparametric Tests

Published online:

18 September 2009

Print publication:

30 June 1995, pp 127-168
- Chapter

1 - Asymptotic Efficiency of Statistical Tests and Mathematical Means for Its Computation
Yakov Nikitin, St Petersburg State University
Book:

Asymptotic Efficiency of Nonparametric Tests

Published online:

18 September 2009

Print publication:

30 June 1995, pp 1-39
- Chapter
