An extensive list of special univariate distributions is studied, and their main properties analyzed. They are the distributions most often encountered in statistics, and we relate them to one another. Some of these distributions arise out of natural phenomena or have attractive special properties which are explored in the exercises. We also study general classifications of distributions and the ensuing properties, including the exponential family, infinitely divisible distributions, and stable distributions. Distributions contain an inherent amount of information and entropy, which we quantify.
If a vector of variates is transformed into another vector, what is the resulting distribution? This chapter details the main methods of making such transformations and obtaining the new distribution, density, and characteristic functions. Although the proof of the transformation theorem for densities is not usually given in statistics textbooks, we use the shortcut of conditioning to give a statistical proof. Applications of the three methods range from simple transformations (including convolutions) to products and ratios. General transformations are also studied. These include rotations of vectors (and their Jacobian), transformations from uniform variates to others (useful for generating simulated samples in Monte-Carlo studies) via the probability integral transformation (PIT), exponential tilting, and others. Extreme-value distributions are studied, to complement the study of sample averages. The chapter concludes by revisiting the copula, which transforms the marginals into a joint distribution.
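As an illustration of the transformation from uniform variates mentioned above, the following minimal sketch generates exponential variates by applying the inverse c.d.f. to uniform draws. The choice of the exponential distribution, the function names, the rate, the sample size, and the seed are all assumptions made for this example; they are not taken from the chapter.

```python
import math
import random


def exponential_from_uniform(u, rate):
    """Inverse c.d.f. of the exponential: F^{-1}(u) = -ln(1 - u) / rate."""
    return -math.log(1.0 - u) / rate


def simulate_exponential(n, rate, seed=0):
    """Draw n exponential variates by transforming uniform variates on [0, 1)."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    return [exponential_from_uniform(rng.random(), rate) for _ in range(n)]


# Illustrative values only: rate 2 implies a population mean of 1/2.
sample = simulate_exponential(100_000, rate=2.0)
sample_mean = sum(sample) / len(sample)
```

With a rate of 2, the sample mean should be close to the population mean of 1/2, which gives a quick sanity check on the transformation.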
This appendix collects mathematical tools that are needed in the main text. In addition, it gives a brief description of some essential background topics. It is assumed that the reader knows elementary calculus. The topics are grouped in four sections. First, we consider some useful methods of indirect proofs. Second, we introduce basic results for complex numbers and polynomials. The third topic concerns series expansions. Finally, some further calculus is presented, including difference calculus, Stieltjes integrals, and multivariable constrained optimization (covering also the case of inequality constraints).
In the previous chapter, we introduced distributions and density functions as two alternative methods to characterize the randomness of variates. In this chapter, we introduce the final method considered in this book, and relate it to the previous two: the moments of a variate (including mean, variance, and higher-order moments). The condition for the existence of moments is explained and justified mathematically. These moments can be summarized by means of "generating functions". We define moment-generating functions (m.g.f.s). Cumulants and their generating functions are introduced. Characteristic functions (c.f.s) always exist, even if some moments do not, and they identify uniquely the distribution of the variate, so we define c.f.s and the inversion theorem required to transform them into the c.d.f. of a variate. We also study the main inequalities satisfied by moments, such as those resulting from transformations (Jensen) or from comparing moments to probabilities (Markov, Chebyshev). We also show that, contrary to what was previously thought, the mean, median, and mode need not be linked by inequalities.
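The Markov and Chebyshev inequalities mentioned above can be checked exactly on a small example. The sketch below uses a fair six-sided die, which is an assumption made purely for illustration, as are the thresholds `a` and `k` and the helper name `prob`.

```python
from fractions import Fraction

# Hypothetical example: p.m.f. of a fair six-sided die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

mean = sum(x * p for x, p in pmf.items())
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())


def prob(event):
    """Probability of the set of outcomes x for which event(x) is true."""
    return sum(p for x, p in pmf.items() if event(x))


# Markov (nonnegative variate): Pr(X >= a) <= E(X) / a.
a = 5
markov_lhs = prob(lambda x: x >= a)
markov_rhs = mean / a

# Chebyshev: Pr(|X - E(X)| >= k) <= var(X) / k^2.
k = 2
chebyshev_lhs = prob(lambda x: abs(x - mean) >= k)
chebyshev_rhs = variance / k**2
```

Both bounds hold here but are far from tight, which is typical: the inequalities use only the first one or two moments and no other feature of the distribution.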
Suppose one has a set of data that arises from a specific distribution with unknown parameter vector. A natural question to ask is the following: what value of this vector is most likely to have generated these data? The answer to this question is provided by the maximum-likelihood estimator (MLE). Likelihood and related functions are the subject of this chapter. It will turn out that we have already seen some examples of MLEs in the previous chapters. Here, we define likelihood, the score vector, the Hessian matrix, the information-matrix equivalence, parameter identification, the Cramér–Rao lower bound and its extensions, profile (concentrated) likelihood and its adjustments, as well as the properties of MLEs (including conditions for existence, consistency, and asymptotic normality) and the score (including martingale representation and local sufficiency). Applications are given, including some for the normal linear model.
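To make the likelihood, score, and Hessian concrete, here is a minimal sketch for a random sample from an exponential distribution with unknown rate. The choice of model, the function names, and the data values are assumptions made for this example only; the data are invented for illustration.

```python
import math


def log_likelihood(rate, data):
    """Log-likelihood of an i.i.d. exponential sample: n*log(rate) - rate*sum(x)."""
    return len(data) * math.log(rate) - rate * sum(data)


def score(rate, data):
    """First derivative of the log-likelihood with respect to the rate."""
    return len(data) / rate - sum(data)


def hessian(rate, data):
    """Second derivative of the log-likelihood with respect to the rate."""
    return -len(data) / rate**2


# Made-up observations, for illustration only.
data = [0.5, 1.2, 0.3, 2.0, 1.0]

# Setting the score to zero gives the closed-form MLE: n / sum(x).
mle = len(data) / sum(data)
```

In this model the score vanishes at the MLE and the Hessian is negative for every positive rate, so the stationary point is the unique maximum of the likelihood.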
Abadir and Magnus (2002, Econometric Theory) proposed a standard for notation in econometrics. The consistent use of the proposed notation in our volumes shows that it is in fact practical. The notational conventions described here mainly apply to the material covered in this volume. Further notation will be introduced, as needed, as the Series develops.
There is a proliferation of methods of point estimation other than ML. First, MLEs may not have an explicit formula and may be computationally more demanding than alternatives. Second, MLEs typically require the specification of a distribution. Third, optimization of criteria other than the likelihood may have some justification. The first argument has become less relevant with the advent of fast computers, and the alternative estimators based on it usually entail a loss of optimality properties. The second can be countered to some extent with large-sample invariance arguments or with the nonparametric MLE and empirical likelihood seen earlier. However, the third reason can be more fundamental. This chapter presents a selection of four common methods of point estimation, addressing the reasons outlined earlier, to varying degrees: method of moments, least squares, nonparametric (density and regression), and Bayesian estimation methods. In addition to these reasons for alternative estimators, point estimation itself may not be the most informative way to summarize what the data indicate about the parameters. Therefore, the chapter also introduces interval estimation and its multivariate generalization, a topic that leads quite naturally to the subject matter of Chapter 14.
This chapter concerns the measurement of the dependence between variates, by exploiting the additional information contained in joint (rather than just marginal) distribution and density functions. For this multivariate context, we also generalize the third description of randomness seen earlier, i.e., moments and their generating functions. Joint moments and their generating functions are introduced, along with covariances, variance matrices, the Cauchy–Schwarz inequality, and joint c.f.s and their inversion into joint densities. We show how the law of iterated expectations makes use of conditioning when taking expectations with respect to more than one variate. We measure dependence via conditional densities, distributions, moments, and cumulants.
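The law of iterated expectations, E(Y) = E(E(Y | X)), can be verified exactly on a small discrete example. The joint p.m.f. below is invented for illustration, and the helper names are assumptions of this sketch.

```python
from fractions import Fraction as F

# Hypothetical joint p.m.f. of (X, Y) on {0, 1} x {0, 1}.
joint = {(0, 0): F(1, 4), (0, 1): F(1, 4), (1, 0): F(1, 8), (1, 1): F(3, 8)}


def marginal_x(x):
    """Marginal probability Pr(X = x), summing the joint p.m.f. over y."""
    return sum(p for (xi, _), p in joint.items() if xi == x)


def conditional_mean_y(x):
    """Conditional expectation E(Y | X = x)."""
    return sum(y * p for (xi, y), p in joint.items() if xi == x) / marginal_x(x)


# Direct expectation of Y from the joint p.m.f.
direct = sum(y * p for (_, y), p in joint.items())

# Iterated expectation: average the conditional means over the marginal of X.
x_values = sorted({x for x, _ in joint})
iterated = sum(conditional_mean_y(x) * marginal_x(x) for x in x_values)
```

Here the two conditional means differ (1/2 when X = 0 and 3/4 when X = 1), which reflects the dependence between the two variates, yet their weighted average recovers the unconditional mean of Y.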
We introduce elementary concepts of sets, probability, and events. We then study and illustrate the basic properties of probability. We use probability to characterize independent events and mutually exclusive events. We study conditioning and Bayes' law. We also introduce essential functions required to calculate probabilities, including the factorial, gamma, and beta functions. We then apply them to calculating combinations and permutations.
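The link between the factorial, gamma, and beta functions and the counting of permutations and combinations can be sketched as follows. The function names are assumptions of this example, and it relies only on Python's standard `math` module.

```python
import math


def permutations_count(n, k):
    """Number of ordered selections of k items from n: n! / (n - k)!."""
    return math.factorial(n) // math.factorial(n - k)


def combinations_count(n, k):
    """Number of unordered selections of k items from n: n! / (k! (n - k)!)."""
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))


def beta(a, b):
    """Beta function written in terms of gamma functions."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)


# The gamma function extends the factorial: gamma(n + 1) = n! for integer n >= 0.
gamma_of_six = math.gamma(6)
five_choose_two = combinations_count(5, 2)
five_permute_two = permutations_count(5, 2)
```

Each combination of k items corresponds to k! orderings, which is why the permutation count exceeds the combination count by exactly that factor.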
This chapter is devoted to the multivariate normal and functions of it. We start by showing how linearity is essential to its definition, then we derive the main properties. These include characteristic and density functions, conditionals, and some of the normal distribution's exceptional properties: the equivalence of no-correlation and independence within the class of elliptical distributions, Cramér's deconvolution theorem, the equivalence of a random sample's normality with the independence of the sample's normal mean and chi-square variance. We also explore other properties such as fourth-order moments in multivariate normal (and elliptical) distributions, the convexity of the m.g.f., joint distributions of linear and quadratic forms and conditions for their independence, the same also for pairs of quadratic forms and their covariance, as well as decompositions of quadratic forms.
The need for this chapter arises once we start considering the realistic case of more than one variate at a time, the multivariate case. We have already started dealing with this topic (in disguise) in the introductions to conditioning and mixing in Chapters 1 and 2, and in some of the exercises using these ideas in Chapter 4. Joint distributions are defined, and we explain their relation to the univariate distributions seen earlier and more generally to the distribution of subsets (marginal distributions). Joint densities are also defined. The independence of variates is defined in terms of their joint distribution. We also introduce the concept of copulas, linking the joint distribution to its marginals.