(a) A research student was proposing an extensive investigation of the decay of seed germination over time. The change of germination rate over two years was to be examined. Seeds would be treated in various ways, and seeds from many different sources would be used. The major design questions were how often to test the germination rate during the two years, and how many seeds to use on each test occasion. The student, when questioned about the pattern of decay with time, was adamant that all previous data showed clearly that the relationship of germination rate with time was linear. For each combination of seed source and treatment, about 2000 seeds would be available. How should the 2000 seeds be sampled over the two-year period?
(b) In a rather similar investigation into the decline of strength of welds used for oil rigs, again looking at the pattern over a long period of stress, comments were requested on a proposed design that the engineers concerned regarded as too obvious to require statistical advice. As in the seed germination investigation, it was believed absolutely that the relationship of strength with duration of stress was linear. The only analysis contemplated was to fit a straight-line regression of the variable measuring strength on the length of time for which the stress was applied. The ‘obvious’ design proposed by the engineers was to use equal replication for eight, equally spaced, durations of stress. Can the despised statistician offer any improvement?
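The statistical point behind both exercises is the same. For a straight-line fit, the variance of the estimated slope is proportional to 1/Σ(tᵢ − t̄)², so a design that pushes replication towards the two ends of the time range estimates the slope more precisely than equal spacing, while a few interior points can be retained to check the assumed linearity. A minimal Python sketch of the comparison (the total sample size, durations, and the split between ends and middle are all hypothetical):

```python
import numpy as np

# Compare the variance of the fitted slope under two designs for a
# straight-line regression, assuming equal error variance sigma^2 = 1.
# Var(slope_hat) = sigma^2 / sum((t_i - t_bar)^2), so a design that
# spreads observations towards the extremes estimates the slope better.

n = 24  # total number of test occasions (hypothetical)

# "Obvious" design: equal replication at eight equally spaced durations.
equal_spaced = np.repeat(np.linspace(0, 2, 8), n // 8)

# Extreme design: if linearity is trusted, put most points at the ends,
# keeping a few in the middle to check the linearity assumption.
extreme = np.concatenate([np.full(10, 0.0), np.full(4, 1.0), np.full(10, 2.0)])

for name, t in [("equal spacing", equal_spaced), ("extremes", extreme)]:
    var_slope = 1.0 / np.sum((t - t.mean()) ** 2)
    print(f"{name:>13}: Var(slope) = {var_slope:.4f}")
```

With these numbers the extreme-heavy design roughly halves the variance of the slope estimate; the interior points sacrifice a little precision to buy a check on the assumed linearity.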
Given a data set, you can fit thousands of models at the push of a button, but how do you choose the best? With so many candidate models, overfitting is a real danger. Is the monkey who typed Hamlet actually a good writer? Choosing a model is central to all statistical work with data. We have seen rapid advances in model fitting and in the theoretical understanding of model selection, yet this book is the first to synthesize research and practice from this active field. Model choice criteria are explained, discussed and compared, including the AIC, BIC, DIC and FIC. The uncertainties involved with model selection are tackled, with discussions of frequentist and Bayesian methods; model averaging schemes are presented. Real-data examples are complemented by derivations providing deeper insight into the methodology, and instructive exercises build familiarity with the methods. The companion website features data sets and R code.
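As a toy illustration of the overfitting danger the blurb alludes to (not from the book; the simulated data and statsmodels-based fit are this sketch's own), the following fits polynomials of increasing degree and lets AIC and BIC arbitrate; higher degrees always reduce the residual sum of squares, but both criteria penalize the extra parameters:

```python
import numpy as np
import statsmodels.api as sm

# Toy illustration: AIC/BIC trade goodness-of-fit against complexity.
# The true relationship is quadratic; polynomial fits of higher degree
# always fit the sample better, but the criteria penalize the added terms.
rng = np.random.default_rng(0)
x = np.linspace(-2, 2, 100)
y = 1.0 + 0.5 * x - 0.8 * x**2 + rng.normal(scale=0.5, size=x.size)

for degree in range(1, 7):
    X = np.vander(x, degree + 1)          # polynomial design matrix
    fit = sm.OLS(y, X).fit()
    print(f"degree {degree}: AIC = {fit.aic:7.1f}  BIC = {fit.bic:7.1f}")
```

Both criteria should bottom out near degree 2, the true order; when they disagree, BIC's heavier log(n) penalty favours the more parsimonious model.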
This book focuses on models and data that arise from repeated observations of a cross-section of individuals, households or companies. These models have found important applications within business, economics, education, political science and other social science disciplines. The author introduces the foundations of longitudinal and panel data analysis at a level suitable for quantitatively oriented graduate social science students as well as individual researchers. He emphasizes mathematical and statistical fundamentals but also describes substantive applications from across the social sciences, showing the breadth and scope that these models enjoy. The applications are enhanced by real-world data sets and software programs in SAS and Stata.
Data Analysis Using Regression and Multilevel/Hierarchical Models, first published in 2007, is a comprehensive manual for the applied researcher who wants to perform data analysis using linear and nonlinear regression and multilevel models. The book introduces a wide variety of models, whilst at the same time instructing the reader in how to fit these models using available software packages. The book illustrates the concepts by working through scores of real data examples that have arisen from the authors' own applied research, with programming code provided for each one. Topics covered include causal inference (regression, poststratification, matching, regression discontinuity, and instrumental variables) as well as multilevel logistic regression and missing-data imputation. Practical tips on building, fitting, and understanding these models are provided throughout.
Many electronic and acoustic signals can be modelled as sums of sinusoids and noise. However, the amplitudes, phases and frequencies of the sinusoids are often unknown and must be estimated in order to characterise the periodicity or near-periodicity of a signal and consequently to identify its source. This book presents and analyses several practical techniques used for such estimation. The problem of tracking slow frequency changes over time of a very noisy sinusoid is also considered. Rigorous analyses are presented via asymptotic or large sample theory, together with physical insight. The book focuses on achieving extremely accurate estimates when the signal-to-noise ratio is low but the sample size is large. Each chapter begins with a detailed overview, and many applications are given. Matlab code for the estimation techniques is also included. The book will thus serve as an excellent introduction and reference for researchers analysing such signals.
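One standard technique of the kind the book analyses (shown here only as an illustrative sketch, not the book's specific estimator) is to estimate an unknown frequency from the peak of the periodogram; the accuracy of such estimates at low signal-to-noise ratio and large sample size is exactly the regime the book studies:

```python
import numpy as np

# Estimate the frequency of a noisy sinusoid by locating the peak of the
# periodogram (squared magnitude of the FFT). This is a standard first
# step; refinements sharpen such estimates, but the sketch shows the idea.
rng = np.random.default_rng(1)
n = 4096                       # large sample, as in the low-SNR regime
f_true = 0.1234                # cycles per sample (hypothetical value)
t = np.arange(n)
x = np.cos(2 * np.pi * f_true * t + 0.7) + rng.normal(scale=2.0, size=n)

spectrum = np.abs(np.fft.rfft(x)) ** 2
k = np.argmax(spectrum[1:]) + 1        # skip the DC bin
f_hat = k / n                          # frequency resolution is 1/n
print(f"true f = {f_true}, estimated f = {f_hat:.4f}")
```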
At last - a book devoted to the negative binomial model and its many variations. Every model currently offered in commercial statistical software packages is discussed in detail - how each is derived, how each resolves a distributional problem, and numerous examples of their application. Many have never before been thoroughly examined in a text on count response models: the canonical negative binomial; the NB-P model, where the negative binomial exponent is itself parameterized; and negative binomial mixed models. As the models address violations of the distributional assumptions of the basic Poisson model, identifying and handling overdispersion is a unifying theme. For practising researchers and statisticians who need to update their knowledge of Poisson and negative binomial models, the book provides a comprehensive overview of estimating methods and algorithms used to model counts, as well as specific guidelines on modeling strategy and how each model can be analyzed to assess goodness-of-fit.
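To illustrate the unifying theme, a minimal sketch (hypothetical simulated data; statsmodels rather than the commercial packages the book surveys) detects overdispersion after a Poisson fit and then fits a negative binomial, whose extra dispersion parameter absorbs it:

```python
import numpy as np
import statsmodels.api as sm

# Simulate overdispersed counts: a gamma-mixed Poisson is a negative
# binomial, here with dispersion alpha = 0.5 (the gamma mixing variance).
rng = np.random.default_rng(2)
n = 1000
x = rng.normal(size=n)
mu = np.exp(0.5 + 0.8 * x)
y = rng.poisson(mu * rng.gamma(shape=2.0, scale=0.5, size=n))

X = sm.add_constant(x)
poisson_fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()
# Pearson chi^2 / df near 1 means the Poisson variance is adequate;
# values well above 1 signal overdispersion.
print("dispersion:", poisson_fit.pearson_chi2 / poisson_fit.df_resid)

nb_fit = sm.NegativeBinomial(y, X).fit(disp=0)   # also estimates alpha
print(nb_fit.params)
```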
The study of spatial processes and their applications is an important topic in statistics and finds wide application, particularly in computer vision and image processing. This book is devoted to statistical inference in spatial statistics and is intended for specialists needing an introduction to the subject and to its applications. One of the themes of the book is the demonstration of how these techniques give new insights into classical procedures (including new examples in likelihood theory) and newer statistical paradigms such as Monte Carlo inference and pseudo-likelihood. Professor Ripley also stresses the importance of edge effects and of the lack of a unique asymptotic setting in spatial problems. Throughout, the author discusses the foundational issues posed and the difficulties, both computational and philosophical, which arise. The final chapters consider image restoration and segmentation methods and the averaging and summarising of images. The book will thus appeal widely to researchers in computer vision, image processing, and those applying microscopy in biology, geology and materials science, as well as to statisticians interested in the foundations of their discipline.
Matrix Algebra is the first volume of the Econometric Exercises Series. It contains exercises relating to course material in matrix algebra that students are expected to know while enrolled in an (advanced) undergraduate or a postgraduate course in econometrics or statistics. The book contains a comprehensive collection of exercises, all with full answers. But the book is not just a collection of exercises; in fact, it is a textbook, though one that is organized in a completely different manner from the usual textbook. The volume can be used either as a self-contained course in matrix algebra or as a supplementary text.
This book introduces basic and advanced concepts of categorical regression with a focus on the structuring constituents of regression, including regularization techniques to structure predictors. In addition to standard methods such as the logit and probit model and extensions to multivariate settings, the author presents more recent developments in flexible and high-dimensional regression, which allow weakening of assumptions on the structuring of the predictor and yield fits that are closer to the data. The generalized linear model is used as a unifying framework whenever possible; in particular, parametric models are treated within this framework. Many topics not normally included in books on categorical data analysis are treated here, such as nonparametric regression; selection of predictors by regularized estimation procedures; alternative models like the hurdle model and zero-inflated regression models for count data; and non-standard tree-based ensemble methods. The book is accompanied by an R package that contains data sets and code for all the examples.
This book is an introduction to the field of asymptotic statistics. The treatment is both practical and mathematically rigorous. In addition to most of the standard topics of an asymptotics course, including likelihood inference, M-estimation, the theory of asymptotic efficiency, U-statistics, and rank procedures, the book also presents recent research topics such as semiparametric models, the bootstrap, and empirical processes and their applications. The topics are organized from the central idea of approximation by limit experiments, which gives the book one of its unifying themes. This entails mainly the local approximation of the classical i.i.d. setup with smooth parameters by location experiments involving a single, normally distributed observation. Thus, even the standard subjects of asymptotic statistics are presented in a novel way. Suitable as a graduate or Master's level statistics text, this book will also give researchers an overview of research in asymptotic statistics.
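The "limit experiment" idea has a precise form: under smoothness, the local log-likelihood ratios of the i.i.d. model admit the local asymptotic normality (LAN) expansion, so that locally the model looks like a single normal observation. A standard statement (the classical form, with the score function and Fisher information as below, not a quotation from the book):

```latex
% LAN expansion for a smooth i.i.d. model: locally the experiment behaves
% like one observation from N(h, I_theta^{-1}). Here \dot\ell_\theta is
% the score function and I_\theta the Fisher information.
\log \frac{dP^{n}_{\theta + h/\sqrt{n}}}{dP^{n}_{\theta}}
  = h^{\top} \Delta_{n,\theta} - \tfrac{1}{2}\, h^{\top} I_{\theta}\, h
    + o_{P_{\theta}}(1),
\qquad
\Delta_{n,\theta} = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \dot\ell_{\theta}(X_i)
  \rightsquigarrow N(0, I_{\theta}).
```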
David A. Freedman presents here a definitive synthesis of his approach to causal inference in the social sciences. He explores the foundations and limitations of statistical modeling, illustrating basic arguments with examples from political science, public policy, law, and epidemiology. Freedman maintains that many new technical approaches to statistical modeling constitute not progress, but regress. Instead, he advocates a 'shoe leather' methodology, which exploits natural variation to mitigate confounding and relies on intimate knowledge of the subject matter to develop meticulous research designs and eliminate rival explanations. When Freedman first enunciated this position, he was met with scepticism, in part because it was hard to believe that a mathematical statistician of his stature would favor 'low-tech' approaches. But the tide is turning. Many social scientists now agree that statistical technique cannot substitute for good research design and subject matter knowledge. This book offers an integrated presentation of Freedman's views.
This lively and engaging book explains the things you have to know in order to read empirical papers in the social and health sciences, as well as the techniques you need to build statistical models of your own. The discussion in the book is organized around published studies, as are many of the exercises. Relevant journal articles are reprinted at the back of the book. Freedman makes a thorough appraisal of the statistical methods in these papers and in a variety of other examples. He illustrates the principles of modelling, and the pitfalls. The discussion shows you how to think about the critical issues - including the connection (or lack of it) between the statistical models and the real phenomena. The book is written for advanced undergraduates and beginning graduate students in statistics, as well as students and professionals in the social and health sciences.
It is increasingly common for analysts to seek out the opinions of individuals and organizations using attitudinal scales such as degree of satisfaction or importance attached to an issue. Examples include levels of obesity, seriousness of a health condition, attitudes towards service levels, opinions on products, voting intentions, and the degree of clarity of contracts. Ordered choice models provide a relevant methodology for capturing the sources of influence that explain the choice made amongst a set of ordered alternatives. The methods have evolved to a level of sophistication that can allow for heterogeneity in the threshold parameters, in the explanatory variables (through random parameters), and in the decomposition of the residual variance. This book brings together contributions in ordered choice modeling from a number of disciplines, synthesizing developments over the last fifty years, and suggests useful extensions to account for the wide range of sources of influence on choice.
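Concretely, an ordered choice model posits a latent regression crossed with unknown threshold parameters. For the ordered logit case (a standard formulation given here for orientation, not drawn from the book), with Λ the logistic cdf, κ₀ = −∞ and κ_J = +∞:

```latex
% Latent-variable formulation of the ordered logit model: y* is
% unobserved; the observed response y falls in category j when y*
% lands between the threshold parameters kappa_{j-1} and kappa_j.
y_i^{*} = x_i^{\top}\beta + \varepsilon_i,
\qquad
y_i = j \;\Longleftrightarrow\; \kappa_{j-1} < y_i^{*} \le \kappa_j,
\qquad
P(y_i \le j \mid x_i) = \Lambda\!\left(\kappa_j - x_i^{\top}\beta\right).
```

The heterogeneity described in the blurb enters by allowing the thresholds κ_j or the coefficients β to vary randomly across respondents.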
This text gives budding actuaries and financial analysts a foundation in multiple regression and time series. They will learn about these statistical techniques using data on the demand for insurance, lottery sales, foreign exchange rates, and other applications. Although no specific knowledge of risk management or finance is presumed, the approach introduces applications in which statistical techniques can be used to analyze real data of interest. In addition to the fundamentals, this book describes several advanced statistical topics that are particularly relevant to actuarial and financial practice, including the analysis of longitudinal, two-part (frequency/severity), and fat-tailed data. Datasets with detailed descriptions, sample statistical software scripts in 'R' and 'SAS', and tips on writing a statistical report, including sample projects, can be found on the book's Web site: http://research.bus.wisc.edu/RegActuaries.
This lively and engaging textbook provides the knowledge required to read empirical papers in the social and health sciences, as well as the techniques needed to build statistical models. The author explains the basic ideas of association and regression, and describes the current models that link these ideas to causality. He focuses on applications of linear models, including generalized least squares and two-stage least squares. The bootstrap is developed as a technique for estimating bias and computing standard errors. Careful attention is paid to the principles of statistical inference. There is background material on study design, bivariate regression, and matrix algebra. To develop technique, there are computer labs, with sample computer programs. The book's discussion is organized around published studies, as are the numerous exercises - many of which have answers included. Relevant papers reprinted at the back of the book are thoroughly appraised by the author.
This second edition of Hilbe's Negative Binomial Regression is a substantial enhancement to the popular first edition. It is the only text devoted entirely to the negative binomial model and its many variations, and nearly every model discussed in the literature is addressed. The theoretical and distributional background of each model is discussed, together with examples of their construction, application, interpretation and evaluation. Complete Stata and R code is provided throughout the text, with additional code (plus SAS), derivations and data provided on the book's website. Written for the practising researcher, the text begins with an examination of risk and rate ratios, and of the estimating algorithms used to model count data. The book then gives an in-depth analysis of Poisson regression and an evaluation of the meaning and nature of overdispersion, followed by a comprehensive analysis of the negative binomial distribution and of its parameterizations into various models for evaluating count data.
Statistics do not lie, nor is probability paradoxical. You just have to have the right intuition. In this lively look at both subjects, David Williams convinces mathematics students of the intrinsic interest of statistics and probability, and statistics students that the language of mathematics can bring real insight and clarity to their subject. He helps students build the intuition needed, in a presentation enriched with examples drawn from all manner of applications, e.g., genetics, filtering, the Black–Scholes option-pricing formula, quantum probability and computing, and classical and modern statistical models. Statistics chapters present both the Frequentist and Bayesian approaches, emphasising Confidence Intervals rather than Hypothesis Tests, and include Gibbs-sampling techniques for the practical implementation of Bayesian methods. A central chapter gives the theory of Linear Regression and ANOVA, and explains how MCMC methods allow greater flexibility in modelling. C or WinBUGS code is provided for computational examples and simulations. Many exercises are included; hints or solutions are often provided.
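For a flavour of the Gibbs-sampling techniques mentioned above, here is a minimal sampler (in Python rather than the book's C or WinBUGS; the priors and data are hypothetical) for the mean and precision of a normal sample, alternating draws from the two standard full conditionals:

```python
import numpy as np

# Minimal Gibbs sampler for data y ~ Normal(mu, 1/tau) with conjugate
# priors mu ~ Normal(0, 1/0.01) and tau ~ Gamma(1, rate 1); each full
# conditional is a standard distribution, so we draw from them in turn.
rng = np.random.default_rng(3)
y = rng.normal(loc=2.0, scale=1.5, size=200)   # synthetic data
n, ybar = y.size, y.mean()

mu, tau = 0.0, 1.0
draws = []
for _ in range(5000):
    # mu | tau, y  ~  Normal(post_mean, 1 / post_prec)
    post_prec = 0.01 + n * tau
    post_mean = n * tau * ybar / post_prec
    mu = rng.normal(post_mean, 1.0 / np.sqrt(post_prec))
    # tau | mu, y  ~  Gamma(1 + n/2, rate 1 + sum((y - mu)^2) / 2)
    rate = 1.0 + 0.5 * np.sum((y - mu) ** 2)
    tau = rng.gamma(shape=1.0 + n / 2, scale=1.0 / rate)
    draws.append((mu, tau))

mus, taus = np.array(draws)[1000:].T           # drop burn-in
print(f"posterior mean of mu ~ {mus.mean():.2f}, "
      f"of sigma ~ {(1 / np.sqrt(taus)).mean():.2f}")
```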
Matched sampling is often used to help assess the causal effect of some exposure or intervention, typically when randomized experiments are not available or cannot be conducted. This book presents a selection of Donald B. Rubin's research articles on matched sampling, from the early 1970s, when the author was one of the major researchers involved in establishing the field, to recent contributions to this now extremely active area. The articles include fundamental theoretical studies that have become classics, important extensions, and real applications that range from breast cancer treatments to tobacco litigation to studies of criminal tendencies. They are organized into seven parts, each with an introduction by the author that provides historical and personal context and discusses the relevance of the work today. A concluding essay offers advice to investigators designing observational studies. The book provides an accessible introduction to the study of matched sampling and will be an indispensable reference for students and researchers.
Exact statistical inference may be employed in diverse fields of science and technology. As problems become more complex and sample sizes become larger, mathematical and computational difficulties can arise that require the use of approximate statistical methods. Such methods are justified by asymptotic arguments but are still based on the concepts and principles that underlie exact statistical inference. With this perspective in mind, this book presents a broad view of exact statistical inference and the development of asymptotic statistical inference, providing a justification for the use of asymptotic methods for large samples. Methodological results are developed on a concrete and yet rigorous mathematical level and are applied to a variety of problems that include categorical data, regression, and survival analyses. This book is designed as a textbook for advanced undergraduate or beginning graduate students in statistics, biostatistics, or applied statistics but may also be used as a reference for academic researchers.
This book introduces in a systematic manner a general nonparametric theory of statistics on manifolds, with emphasis on manifolds of shapes. The theory has important and varied applications in medical diagnostics, image analysis, and machine vision. An early chapter of examples establishes the effectiveness of the new methods and demonstrates how they outperform their parametric counterparts. Inference is developed for both intrinsic and extrinsic Fréchet means of probability distributions on manifolds, then applied to shape spaces defined as orbits of landmarks under a Lie group of transformations - in particular, similarity, reflection similarity, affine and projective transformations. In addition, nonparametric Bayesian theory is adapted and extended to manifolds for the purposes of density estimation, regression and classification. Ideal for statisticians who analyze manifold data and wish to develop their own methodology, this book is also of interest to probabilists, mathematicians, computer scientists, and morphometricians with mathematical training.
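To make "extrinsic Fréchet mean" concrete: under a standard embedding it is the ordinary Euclidean mean projected back onto the manifold. A tiny illustrative sketch on the simplest manifold, the unit circle (not the book's code; the directions are hypothetical):

```python
import numpy as np

# Extrinsic Frechet mean of directions on the unit circle: embed the
# angles as unit vectors in R^2, take the ordinary Euclidean mean, and
# project it back onto the circle. The naive average of the raw angles
# fails badly for data straddling the 0/360 wrap-around.
angles = np.deg2rad([350.0, 355.0, 5.0, 10.0])   # hypothetical directions

vectors = np.column_stack([np.cos(angles), np.sin(angles)])
euclid_mean = vectors.mean(axis=0)
extrinsic = euclid_mean / np.linalg.norm(euclid_mean)  # projection step

naive = np.rad2deg(angles.mean())
frechet = np.rad2deg(np.arctan2(extrinsic[1], extrinsic[0])) % 360
print(f"naive angle mean: {naive:.1f} deg, extrinsic mean: {frechet:.1f} deg")
```

Here the naive average reports 180 degrees for data clustered around 0, while the extrinsic mean correctly returns a direction near 0.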