This chapter explores methods of concentration that do not rely on independence. We introduce the isoperimetric approach and discuss concentration inequalities across a variety of metric measure spaces – including the sphere, Gaussian space, discrete and continuous cubes, the symmetric group, Riemannian manifolds, and the Grassmannian. As an application, we derive the Johnson–Lindenstrauss lemma, a fundamental result in dimensionality reduction for high-dimensional data. We then develop matrix concentration inequalities, with an emphasis on the matrix Bernstein inequality, which extends the classical Bernstein inequality to random matrices. Applications include community detection in sparse networks and covariance estimation for heavy-tailed distributions. Exercises explore binary dimension reduction, matrix calculus, additional matrix concentration results, and matrix sketching.
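As a concrete illustration of the Johnson–Lindenstrauss lemma mentioned above, the following minimal sketch uses a Gaussian random projection (one standard construction, assumed here; the chapter may use a different one) to check that pairwise distances are approximately preserved after projection.

```python
import numpy as np

# Minimal sketch: Johnson-Lindenstrauss dimension reduction via a Gaussian
# random projection (one standard construction among several).
rng = np.random.default_rng(0)

n, d, k = 100, 10_000, 300            # points, ambient dim, target dim
X = rng.normal(size=(n, d))           # synthetic high-dimensional data

# Entries N(0, 1/k) so that squared norms are preserved in expectation.
A = rng.normal(scale=1.0 / np.sqrt(k), size=(d, k))
Y = X @ A                             # projected data, shape (n, k)

# Compare one pairwise distance before and after projection.
i, j = 0, 1
orig = np.linalg.norm(X[i] - X[j])
proj = np.linalg.norm(Y[i] - Y[j])
print(f"original {orig:.2f}, projected {proj:.2f}, ratio {proj / orig:.3f}")
```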
Measurement models are the focus of Chapter 5, which treats the nature of concepts, theoretical definitions, and latent variables. The chapter explains model specification, implied moments, model identification, model estimation, and model interpretation, fit, and diagnostics in confirmatory factor analysis (CFA) models. It also covers factor score prediction and model respecification.
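To make the idea of implied moments concrete, here is a minimal sketch that computes the covariance matrix implied by a one-factor CFA model, $\Sigma = \Lambda\Phi\Lambda^{\top} + \Theta$; the loadings and variances are illustrative assumptions, not an example from the chapter.

```python
import numpy as np

# Hedged sketch: covariance matrix implied by a one-factor CFA model with
# three indicators.  Loadings and variances are illustrative values only.
Lambda = np.array([[0.8], [0.7], [0.6]])   # factor loadings (3 x 1)
Phi = np.array([[1.0]])                    # factor variance
Theta = np.diag([0.36, 0.51, 0.64])        # unique (error) variances

# Implied moment structure: Sigma = Lambda Phi Lambda' + Theta
Sigma = Lambda @ Phi @ Lambda.T + Theta
print(np.round(Sigma, 3))
# Estimation then chooses the parameters so that Sigma matches the sample
# covariance matrix of the observed indicators as closely as possible.
```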
Chapter 3 concentrates on single-equation regression models but presents them from the perspective of structural equation models. It introduces and applies the major steps of structural equation modeling: model specification, implied moments, model identification, model estimation, and model interpretation and fit. It also includes diagnostics and testing for regression and a discussion of the consequences of using multiple regression with variables measured with error.
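The consequence of regressing on an error-laden predictor can be seen in a short simulation. The sketch below (not taken from the book) illustrates the familiar attenuation of the slope toward zero when the covariate is measured with error.

```python
import numpy as np

# Hedged sketch: attenuation bias when a regressor is measured with error.
rng = np.random.default_rng(1)
n, beta = 100_000, 2.0

x_true = rng.normal(size=n)                      # true covariate, variance 1
y = beta * x_true + rng.normal(size=n)           # outcome
x_obs = x_true + rng.normal(size=n)              # observed with error, variance 1

# OLS slope using the error-laden covariate.
slope = np.cov(x_obs, y)[0, 1] / np.var(x_obs)
print(f"true slope {beta}, estimated slope {slope:.3f}")
# Expected attenuation factor: var(x) / (var(x) + var(error)) = 0.5,
# so the estimate should land near 1.0 rather than 2.0.
```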
In the Preface, I wrote that the primary purpose of the book was to provide readers with a solid foundation in structural equation models (SEMs). I had several audiences in mind. One was readers who desired to be more informed users of SEMs, with an understanding that goes beyond the input commands and output of SEM programs. I also hoped to reach quantitative methodologists who sought to master and to create new tools for SEMs. Finally, I aimed to compose a resource for statisticians, biostatisticians, and data scientists who wished to learn about latent variable modeling with multiple indicators and systems of equations. For those who have made it this far, I hope that your knowledge of SEMs is much deeper than before.
Distributed ledgers, including blockchains and other decentralized databases, are designed to store information online so that all trusted network members can update the data transparently. The dynamics of a ledger’s growth can be represented mathematically by a directed acyclic graph (DAG). In this paper, we study a DAG model that incorporates batch arrivals and random attachment delays. We analyze the asymptotic behavior of this model by letting the arrival rate go to infinity and the inter-arrival time go to zero. We establish that the number of leaves in the DAG, as well as various random variables characterizing its vertices, can be approximated by a fluid limit, represented as the solution to a set of delayed partial differential equations. Furthermore, we characterize the stable state of this fluid limit and validate our findings through simulations.
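A rough simulation in the spirit of this model can help convey the leaf-count dynamics. The sketch below is a simplified illustration only, assuming batch arrivals, uniform attachment to leaves, and a one-batch visibility delay; it is not the paper's exact model.

```python
import numpy as np

# Simplified sketch of a DAG ledger: vertices arrive in batches and each new
# vertex attaches to a leaf chosen uniformly at random.  Attachment delay is
# modelled crudely by letting each batch see only the leaves that existed
# when the previous batch finished.  This is not the paper's exact model.
rng = np.random.default_rng(2)

batch_size, n_batches = 5, 2_000
leaves = {0}                  # genesis vertex
visible = [0]                 # delayed view of the leaf set
next_id = 1
leaf_counts = []

for _ in range(n_batches):
    for _ in range(batch_size):
        parent = int(rng.choice(visible))   # attach to a visible leaf
        leaves.discard(parent)              # parent is no longer a leaf
        leaves.add(next_id)
        next_id += 1
    visible = list(leaves)                  # refresh the delayed view
    leaf_counts.append(len(leaves))

print("final number of leaves:", leaf_counts[-1])
```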
We consider shock models governed by the bivariate geometric counting process. Under the competing risks framework, failures are due to one of two mutually exclusive causes (shocks). We obtain and study some relevant functions, such as failure densities, survival functions, the probability of each cause of failure, and moments of the failure time conditioned on a specific cause. These functions are specified by assuming that systems or living organisms fail at the first instant at which the sum of received shocks reaches a random threshold. Under this failure scheme, we also treat various cases arising from suitable choices of the random threshold.
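A hedged simulation of this failure scheme is sketched below; the two Bernoulli shock streams, the geometric threshold, and all parameter values are illustrative assumptions rather than the paper's parametrization.

```python
import numpy as np

# Hedged sketch of the competing-risks shock scheme: two shock streams, and
# failure at the first epoch at which the total number of shocks reaches a
# random threshold, the failing cause being the shock type that crosses it.
# All distributional choices and parameters are illustrative only.
rng = np.random.default_rng(3)

def simulate_once(p1=0.05, p2=0.03, threshold_mean=5):
    threshold = rng.geometric(1.0 / threshold_mean)   # random threshold
    count, t = 0, 0
    while True:
        t += 1
        for cause, hit in ((1, rng.random() < p1), (2, rng.random() < p2)):
            if hit:
                count += 1
                if count >= threshold:
                    return t, cause

results = [simulate_once() for _ in range(20_000)]
times = np.array([r[0] for r in results])
causes = np.array([r[1] for r in results])
print("mean failure time:", times.mean())
print("P(failure due to cause 1):", (causes == 1).mean())
```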
The relevation model is a fundamental tool in reliability engineering for assessing the effectiveness of redundancy allocation in coherent systems. In this study, we address the problem of allocating relevations to one or two nodes of a coherent system with independent components so as to enhance system reliability. We establish results concerning the usual stochastic and hazard rate orders for coherent systems. Moreover, we illustrate our findings with a range of examples and counterexamples. In addition, we conduct a simulation-based study and a real data analysis to further illustrate the application of our results. Lastly, we study the case of the minimal repair policy in detail.
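For readers unfamiliar with the relevation transform, the following sketch contrasts relevation (replacement by a used component of the same age) with cold-standby replacement for Weibull components; the distributional choices are illustrative only.

```python
import numpy as np

# Hedged sketch: relevation versus cold-standby replacement for Weibull
# components (shape k, scale 1).  In a relevation the failed component is
# replaced by a *used* component of the same age; in cold standby it is
# replaced by a new one.  Parameters are illustrative only.
rng = np.random.default_rng(4)
k, n = 2.0, 200_000                        # Weibull shape, replications

X = rng.weibull(k, size=n)                 # first component's lifetime

# Relevation: the replacement has already survived to age X, so its failure
# age is drawn from the Weibull law conditioned on exceeding X (inverse method).
E = rng.exponential(size=n)
T_relevation = (X**k + E) ** (1.0 / k)

# Cold standby: a brand-new component, so lifetimes simply add.
T_standby = X + rng.weibull(k, size=n)

print("mean lifetime, relevation  :", T_relevation.mean())
print("mean lifetime, cold standby:", T_standby.mean())
# With an increasing failure rate (k > 1) the used replacement is worse, so
# the cold-standby mean should exceed the relevation mean.
```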
This article proposes and studies two Huber-type estimation approaches, namely Huber instrumental variable (IV) estimation and Huber generalized method of moments (GMM) estimation, for a spatial autoregressive model. We establish the consistency, asymptotic distributions, finite sample breakdown points, and influence functions of these estimators. Simulation studies show that, compared with the corresponding traditional estimators (the two-stage least squares estimator, the best IV estimator, and the GMM estimator), our estimators are more robust when the unknown disturbances are long-tailed and lose only a little efficiency when the disturbances are short-tailed. Moreover, the Huber GMM estimator also outperforms several robust estimators in the literature. Finally, we apply our estimation method to investigate the impact of the urban heat island effect on housing prices. A package is published on GitHub for practitioners to use in their empirical studies.
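To convey the flavor of Huberizing an IV moment condition, here is a generic scalar sketch; it is not the paper's spatial-autoregressive estimator, and the instrument, error distribution, and tuning constant are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Hedged sketch: a Huberized IV moment condition for the scalar model
# y = x*beta + u with an endogenous regressor x and an instrument z.  This
# is a generic illustration, not the paper's spatial-autoregressive setup.
rng = np.random.default_rng(5)
n, beta_true, c = 5_000, 1.5, 1.345        # c: Huber tuning constant

z = rng.normal(size=n)                     # instrument
u = rng.standard_t(df=2, size=n)           # long-tailed disturbance
x = 0.8 * z + 0.5 * u + rng.normal(size=n) # endogenous regressor
y = beta_true * x + u

def huber_psi(r, c):
    return np.clip(r, -c, c)               # Huber score function

def moment(beta):
    # Sample analogue of E[z * psi_c(y - x*beta)], which should be zero
    # at the Huberized IV estimate.
    return np.mean(z * huber_psi(y - x * beta, c))

beta_iv = np.mean(z * y) / np.mean(z * x)  # classical IV estimate
beta_huber = minimize_scalar(lambda b: moment(b) ** 2,
                             bounds=(beta_iv - 5.0, beta_iv + 5.0),
                             method="bounded").x
print(f"classical IV: {beta_iv:.3f}, Huberized IV: {beta_huber:.3f}")
```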
A general asymptotic theory is established for sample cross moments of nonstationary time series, allowing for long-range dependence and local unit roots. The theory substantially extends earlier results on nonparametric regression, covering near-cointegrated nonparametric regression as well as spurious nonparametric regression. Many new models are covered by the limit theory, among them functional coefficient regressions in which both the regressors and the functional covariate are nonstationary. Simulations show finite-sample performance that matches the asymptotic theory well and has broad relevance to applications, while revealing how dual nonstationarity in regressors and covariates heightens sensitivity to bandwidth choice and to the impact of dimensionality in nonparametric regression. An empirical example is provided involving climate data regression to assess Earth’s climate sensitivity to CO$_2$, where nonstationarity is a prominent feature of both the regressors and covariates in the model. To our knowledge, this application is the first nonparametric empirical analysis to assess potential nonlinear impacts of CO$_2$ on Earth’s climate while allowing for nonstationarity in both the regressors and covariates.
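A small sketch of Nadaraya–Watson kernel regression with a random-walk regressor (illustrative only, not the paper's estimator or data) conveys the bandwidth sensitivity that arises when the covariate is nonstationary.

```python
import numpy as np

# Hedged sketch: Nadaraya-Watson kernel regression when the regressor is a
# nonstationary random walk.  The model and bandwidths are illustrative only.
rng = np.random.default_rng(6)
n = 2_000

x = np.cumsum(rng.normal(size=n))          # random-walk (unit-root) regressor

def f(u):
    return np.sin(u / 5.0)                 # true regression function

y = f(x) + rng.normal(scale=0.3, size=n)

def nw_estimate(x0, h):
    # Nadaraya-Watson estimator with a Gaussian kernel and bandwidth h.
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)
    return np.sum(w * y) / np.sum(w)

grid = np.linspace(np.quantile(x, 0.1), np.quantile(x, 0.9), 50)
for h in (0.5, 2.0, 8.0):                  # crude bandwidth sensitivity check
    fit = np.array([nw_estimate(x0, h) for x0 in grid])
    rmse = np.sqrt(np.mean((fit - f(grid)) ** 2))
    print(f"bandwidth {h:>4}: RMSE on grid = {rmse:.3f}")
```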
In recent works on the theory of machine learning, it has been observed that heavy tail properties of stochastic gradient descent (SGD) can be studied in the probabilistic framework of stochastic recursions. In particular, Gürbüzbalaban et al. (2021) considered a setup corresponding to linear regression for which iterations of SGD can be modelled by a multivariate affine stochastic recursion $X_n=A_nX_{n-1}+B_n$ for independent and identically distributed pairs $(A_n,B_n)$, where $A_n$ is a random symmetric matrix and $B_n$ is a random vector. However, their approach is not completely correct and, in the present paper, the problem is put into the right framework by applying the theory of irreducible-proximal matrices.
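To illustrate how such an affine recursion can generate heavy tails, here is a scalar sketch of the Kesten-type mechanism; the paper treats random symmetric matrices $A_n$, and the one-dimensional simulation below is only meant to convey the idea.

```python
import numpy as np

# Hedged scalar sketch of the affine recursion X_n = A_n X_{n-1} + B_n.
# The paper's setting has random symmetric matrices A_n; this scalar version
# only illustrates the Kesten-type heavy-tail mechanism: contraction on
# average (E[log A] < 0) together with occasional expansion (P(A > 1) > 0).
rng = np.random.default_rng(7)
n_paths, n_steps = 20_000, 400

X = np.zeros(n_paths)
for _ in range(n_steps):
    A = rng.lognormal(mean=-0.1, sigma=0.5, size=n_paths)
    B = rng.normal(size=n_paths)
    X = A * X + B                          # iterate the recursion in parallel

# Crude tail check: the stationary law has power-law tails, so high quantiles
# grow much faster than they would for a Gaussian of comparable scale.
for q in (0.99, 0.999, 0.9999):
    print(f"quantile {q}: {np.quantile(np.abs(X), q):.1f}")
```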
Following Entman’s observation that policy frames define social problems, diagnose causes, and suggest remedies, we examined the strategies that 12 U.S. governors (from states matched according to population size and density, demographic composition, per capita incomes, geographic proximity, and COVID-19 incidence) used to frame COVID-19 policy agendas. After scraping the governors’ statements about COVID-19 from press releases issued from January 2020 to May 2023 (N = 14,629), we leveraged ChatGPT (GPT) to identify and assess the intensity of public health, economic stability, and civic vitality frames. Subsequent analysis explored differences in the framing strategies according to the governors’ political party and gender. In the process, this study underscores the importance of AI prompt engineering to realize GPT’s transformative potential to facilitate communication research by efficiently identifying and assessing the content of policy frames.
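A minimal sketch of the kind of prompt-based frame scoring described here might look as follows; the prompt wording, model name, and 0–5 intensity scale are hypothetical, not the study's actual instrument.

```python
from openai import OpenAI

# Hedged sketch of prompt-based frame scoring.  The prompt wording, model
# name, and 0-5 intensity scale are illustrative assumptions, not the
# study's actual instrument.
client = OpenAI()  # requires OPENAI_API_KEY in the environment

PROMPT = (
    "You are coding a governor's COVID-19 press release. Rate the intensity "
    "of each frame on a 0-5 scale and reply as JSON with the keys "
    "'public_health', 'economic_stability', and 'civic_vitality'.\n\n"
    "Statement: {statement}"
)

def score_frames(statement: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": PROMPT.format(statement=statement)}],
        temperature=0,
    )
    return response.choices[0].message.content

print(score_frames("We are reopening restaurants at half capacity next week."))
```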