This Element introduces the basics of Bayesian regression modeling using modern computational tools. It assumes only that the reader has taken a basic statistics course and has seen Bayesian inference at the introductory level of Gill and Bao (2024). Some matrix algebra knowledge is assumed, but the authors walk carefully through the necessary structures at the start of the Element. By the end of the process, readers will fully understand how Bayesian regression models are developed and estimated, including linear and nonlinear versions. The sections cover theoretical principles and real-world applications in order to provide motivation and intuition. Because Bayesian methods are intricately tied to software, code in R and Python is provided throughout.
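As a flavor of the kind of code such an Element pairs with the theory, here is a minimal R sketch of a Bayesian linear regression fit with the rstanarm package; the dataset, formula, and priors are illustrative assumptions, not taken from the Element itself.

    # Minimal Bayesian linear regression sketch (illustrative, not the Element's own code)
    library(rstanarm)

    # Built-in example data: fuel economy modeled as a function of weight and horsepower
    fit <- stan_glm(
      mpg ~ wt + hp,
      data   = mtcars,
      family = gaussian(),
      prior  = normal(0, 2.5),   # weakly informative prior on the coefficients
      chains = 4, iter = 2000, seed = 123
    )

    summary(fit)             # posterior summaries for coefficients and sigma
    posterior_interval(fit)  # credible intervals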
The accumulation of empirical evidence that has been collected in multiple contexts, places, and times requires a more comprehensive understanding of empirical research than is typically required for interpreting the findings from individual studies. We advance a novel conceptual framework in which causal mechanisms are central to characterizing social phenomena that transcend context, place, or time. We distinguish various concepts of external validity, all of which characterize the relationship between the effects produced by mechanisms in different settings. Approaches to evidence accumulation require careful consideration of cross-study features, including theoretical considerations that link constituent studies and measurement considerations about how phenomena are quantified. Our main theoretical contribution is developing uniting principles that constitute the qualitative and quantitative assumptions that form the basis for a quantitative relationship between constituent studies. We then apply our framework to three approaches to studying general social phenomena: meta-analysis, replication, and extrapolation.
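To make the meta-analysis application concrete, a minimal R sketch of a random-effects meta-analysis with the metafor package is shown below; the effect sizes and standard errors are invented for illustration and are not drawn from any constituent studies discussed in the Element.

    # Illustrative random-effects meta-analysis (hypothetical effect sizes)
    library(metafor)

    dat <- data.frame(
      study = paste("Study", 1:5),
      yi    = c(0.20, 0.35, 0.10, 0.28, 0.15),  # estimated effects (hypothetical)
      sei   = c(0.08, 0.12, 0.07, 0.10, 0.09)   # standard errors (hypothetical)
    )

    res <- rma(yi = yi, sei = sei, data = dat, method = "REML")
    summary(res)   # pooled effect and heterogeneity statistics
    forest(res)    # forest plot of study-level and pooled estimates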
In this Element, the authors introduce Bayesian probability and inference for social science students and practitioners starting from the absolute beginning and walk readers steadily through the Element. No previous knowledge is required other than that in a basic statistics course. At the end of the process, readers will understand the core tenets of Bayesian theory and practice in a way that enables them to specify, implement, and understand models using practical social science data. Chapters will cover theoretical principles and real-world applications that provide motivation and intuition. Because Bayesian methods are intricately tied to software, code in both R and Python is provided throughout.
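As a hint of what starting from the absolute beginning looks like in code, here is a small, self-contained R sketch of a conjugate Beta-Binomial update; the prior parameters and data are illustrative assumptions, not examples taken from the Element.

    # Conjugate Beta-Binomial updating: prior Beta(a, b), data = k successes in n trials
    a <- 1; b <- 1        # uniform prior on the success probability (assumption)
    k <- 7; n <- 10       # hypothetical data: 7 successes out of 10 trials

    a_post <- a + k       # posterior shape parameters
    b_post <- b + n - k

    a_post / (a_post + b_post)              # posterior mean
    qbeta(c(0.025, 0.975), a_post, b_post)  # 95% credible interval

    # Plot prior (dashed) and posterior densities
    curve(dbeta(x, a, b), from = 0, to = 1, ylab = "density", lty = 2)
    curve(dbeta(x, a_post, b_post), add = TRUE)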
In this Element, which continues our discussion in Foundations, the authors provide an accessible and practical guide for the analysis and interpretation of Regression Discontinuity (RD) designs that encourages the use of a common set of practices and facilitates the accumulation of RD-based empirical evidence. The focus is on extensions to the canonical sharp RD setup discussed in Foundations. The discussion covers (i) the local randomization framework for RD analysis, (ii) the fuzzy RD design, where compliance with treatment is imperfect, (iii) RD designs with discrete scores, and (iv) multi-dimensional RD designs.
The goal of this Element is to provide a detailed introduction to adaptive inventories, an approach to making surveys adjust dynamically to respondents' answers. This method can help survey researchers measure important latent traits or attitudes accurately while minimizing the number of questions respondents must answer. The Element provides both a theoretical overview of the method and a suite of tools and tricks for integrating it into the normal survey process. It also provides practical advice and direction on how to calibrate, evaluate, and field adaptive batteries, using example batteries that measure a variety of latent traits of interest to survey researchers across the social sciences.
Quantitative social scientists use survival analysis to understand the forces that determine the duration of events. This Element provides a guide to new techniques and models in survival analysis, particularly in three areas: non-proportional covariate effects, competing risks, and multi-state models. It also revisits models for repeated events. The Element promotes multi-state models as a unified framework for survival analysis and highlights the role of general transition probabilities as key quantities of interest that complement traditional hazard analysis. These quantities focus on the long-term probabilities that units will occupy particular states conditional on their current state, and they are central in the design and implementation of policy interventions.
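For readers who want a concrete starting point, the sketch below fits a standard Cox model and checks the proportional-hazards assumption with the survival package in R; the lung dataset and the chosen covariates are illustrative, not the Element's own examples.

    # Cox proportional hazards model and a proportional-hazards check (illustrative)
    library(survival)

    fit <- coxph(Surv(time, status) ~ age + sex, data = lung)
    summary(fit)   # hazard ratios and confidence intervals

    # Test the proportional hazards assumption (relevant to non-proportional effects)
    cox.zph(fit)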
In discrete choice models the relationships between the independent variables and the choice probabilities are nonlinear, depending on both the value of the particular independent variable being interpreted and the values of the other independent variables. Thus, interpreting the magnitude of the effects (the “substantive effects”) of the independent variables on choice behavior requires the use of additional interpretative techniques. Three common techniques for interpretation are described here: first differences, marginal effects and elasticities, and odds ratios. Concepts related to these techniques are also discussed, as well as methods to account for estimation uncertainty. Interpretation of binary logits, ordered logits, multinomial and conditional logits, and mixed discrete choice models such as mixed multinomial logits and random effects logits for panel data is covered in detail. The techniques discussed here are general and can be applied to other models with discrete dependent variables that are not specifically described here.
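To illustrate what a first difference and an odds ratio mean in practice, here is a small R sketch for a binary logit; the data and covariate values are hypothetical and chosen only to show the mechanics.

    # First difference and odds ratios from a binary logit (illustrative data: mtcars)
    fit <- glm(am ~ wt + hp, data = mtcars, family = binomial())

    # Predicted probability at a low vs. high value of wt, holding hp at its mean
    lo <- data.frame(wt = 2.0, hp = mean(mtcars$hp))
    hi <- data.frame(wt = 4.0, hp = mean(mtcars$hp))

    p_lo <- predict(fit, newdata = lo, type = "response")
    p_hi <- predict(fit, newdata = hi, type = "response")

    p_hi - p_lo      # first difference in the choice probability

    exp(coef(fit))   # odds ratios for each covariate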
Text contains a wealth of information about a wide variety of sociocultural constructs. Automated prediction methods can infer these quantities (sentiment analysis is probably the most well-known application). However, there is virtually no limit to the kinds of things we can predict from text: power, trust, and misogyny are all signaled in language. These algorithms easily scale to corpus sizes infeasible for manual analysis. Prediction algorithms have become steadily more powerful, especially with the advent of neural network methods. However, applying these techniques usually requires profound programming knowledge and machine learning expertise. As a result, many social scientists do not apply them. This Element provides the working social scientist with an overview of the most common methods for text classification, an intuition of their applicability, and Python code to execute them. It covers both the ethical foundations of such work and the emerging potential of neural network methods.
Data are not only ubiquitous in society, but are increasingly complex both in size and dimensionality. Dimension reduction offers researchers and scholars the ability to make such complex, high dimensional data spaces simpler and more manageable. This Element offers readers a suite of modern unsupervised dimension reduction techniques along with hundreds of lines of R code, to efficiently represent the original high dimensional data space in a simplified, lower dimensional subspace. Launching from the earliest dimension reduction technique, principal components analysis, and using real social science data, I introduce and walk readers through application of the following techniques: locally linear embedding, t-distributed stochastic neighbor embedding (t-SNE), uniform manifold approximation and projection, self-organizing maps, and deep autoencoders. The result is a well-stocked toolbox of unsupervised algorithms for tackling the complexities of high dimensional data so common in modern society. All code is publicly accessible on GitHub.
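As a sense of the workflow, the sketch below runs the launching-point technique, principal components analysis, in base R; the example data are illustrative rather than the social science data used in the Element.

    # Principal components analysis as a baseline dimension reduction (illustrative)
    X <- scale(USArrests)   # standardize before PCA

    pca <- prcomp(X)
    summary(pca)            # variance explained by each component

    # Two-dimensional representation of the original four-dimensional space
    plot(pca$x[, 1:2], xlab = "PC1", ylab = "PC2")
    text(pca$x[, 1:2], labels = rownames(USArrests), cex = 0.6)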
This Element discusses how shiny, an R package, can help instructors teach quantitative methods more effectively by way of interactive web apps. The interactivity increases instructors' effectiveness by making students more active participants in the learning process, allowing them to engage with otherwise complex material in an accessible, dynamic way. The Element offers four detailed apps that cover two fundamental linear regression topics: estimation methods (least squares, maximum likelihood) and the classic linear regression assumptions. It includes a summary of what the apps can be used to demonstrate, detailed descriptions of the apps' full capabilities, vignettes from actual class use, and example activities. Two other apps pertain to a more advanced topic (LASSO), with similar supporting material. For instructors interested in modifying the apps, the Element also documents the main apps' general code structure, highlights some of the more likely modifications, and goes through what functions need to be amended.
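To give a flavor of the kind of interactivity involved, here is a minimal, hypothetical shiny app, not one of the Element's own apps, that lets students vary the sample size and see how a least squares fit changes.

    # Minimal shiny app: resample data and refit a least squares line (illustrative)
    library(shiny)

    ui <- fluidPage(
      sliderInput("n", "Sample size", min = 10, max = 500, value = 50),
      plotOutput("fitplot")
    )

    server <- function(input, output) {
      output$fitplot <- renderPlot({
        x <- rnorm(input$n)
        y <- 1 + 2 * x + rnorm(input$n)   # true line: intercept 1, slope 2
        plot(x, y)
        abline(lm(y ~ x), lwd = 2)        # fitted least squares line
      })
    }

    shinyApp(ui, server)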
In the age of data-driven problem-solving, applying sophisticated computational tools for explaining substantive phenomena is a valuable skill. Yet, application of methods assumes an understanding of the data, structure, and patterns that influence the broader research program. This Element offers researchers and teachers an introduction to clustering, which is a prominent class of unsupervised machine learning for exploring and understanding latent, non-random structure in data. A suite of widely used clustering techniques is covered in this Element, in addition to R code and real data to facilitate interaction with the concepts. Upon setting the stage for clustering, the following algorithms are detailed: agglomerative hierarchical clustering, k-means clustering, Gaussian mixture models, and, at a higher level, fuzzy C-means clustering, DBSCAN, and partitioning around medoids (k-medoids) clustering.
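For orientation, the sketch below runs two of the covered algorithms, k-means and agglomerative hierarchical clustering, on a standard built-in dataset; the data and the choice of k are illustrative assumptions rather than the Element's own application.

    # k-means and agglomerative hierarchical clustering (illustrative)
    X <- scale(iris[, 1:4])   # numeric features only, standardized

    # k-means with k = 3 (assumed number of clusters)
    km <- kmeans(X, centers = 3, nstart = 25)
    table(km$cluster, iris$Species)   # compare clusters to known labels

    # Agglomerative hierarchical clustering with Ward linkage
    hc <- hclust(dist(X), method = "ward.D2")
    plot(hc, labels = FALSE)
    cutree(hc, k = 3)                 # cut the dendrogram into 3 groups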
Text is everywhere, and it is a fantastic resource for social scientists. However, because it is so abundant, and because language is so variable, it is often difficult to extract the information we want. There is a whole subfield of AI concerned with text analysis (natural language processing). Many of the basic analysis methods developed are now readily available as Python implementations. This Element will teach you when to use which method, the mathematical background of how it works, and the Python code to implement it.
We elaborate a general workflow of weighting-based survey inference, decomposing it into two main tasks. The first is the estimation of population targets from one or more sources of auxiliary information. The second is the construction of weights that calibrate the survey sample to the population targets. We emphasize that these tasks are predicated on models of the measurement, sampling, and nonresponse process whose assumptions cannot be fully tested. After describing this workflow in abstract terms, we then describe in detail how it can be applied to the analysis of historical and contemporary opinion polls. We also discuss extensions of the basic workflow, particularly inference for causal quantities and multilevel regression and poststratification.
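As one concrete instance of the weight-construction step, here is a small R sketch that rakes a sample to assumed population margins with the survey package; the sample, variable names, and population counts are all hypothetical.

    # Raking a survey sample to population margins (hypothetical data and margins)
    library(survey)

    # Hypothetical respondent data with two demographic factors
    samp <- data.frame(
      sex    = factor(sample(c("F", "M"), 500, replace = TRUE)),
      agegrp = factor(sample(c("18-34", "35-64", "65+"), 500, replace = TRUE))
    )

    des <- svydesign(ids = ~1, data = samp, weights = ~1)

    # Population targets (counts) for each margin (assumed, not real)
    pop_sex <- data.frame(sex = c("F", "M"), Freq = c(520, 480))
    pop_age <- data.frame(agegrp = c("18-34", "35-64", "65+"), Freq = c(300, 450, 250))

    raked <- rake(des, sample.margins = list(~sex, ~agegrp),
                  population.margins = list(pop_sex, pop_age))

    summary(weights(raked))   # distribution of the calibrated weights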
Images play a crucial role in shaping and reflecting political life. Digitization has vastly increased the presence of such images in daily life, creating valuable new research opportunities for social scientists. We show how recent innovations in computer vision methods can substantially lower the costs of using images as data. We introduce readers to the deep learning algorithms commonly used for object recognition, facial recognition, and visual sentiment analysis. We then provide guidance and specific instructions for scholars interested in using these methods in their own research.
Building on the Cambridge Element Agent Based Models of Social Life: Fundamentals (Cambridge, 2020), we move on to the next level. We do this by building agent-based models of polarization and ethnocentrism. In the process, we develop: stochastic models, which add a crucial element of uncertainty to human interaction; models of human interactions structured by social networks; and 'evolutionary' models in which agents using more effective decision rules are more likely to survive and prosper than others. The aim is to leave readers with an effective toolkit for building, running, and analyzing agent-based models of social interaction.
Social interactions are rich, complex, and dynamic. One way to understand these is to model interactions that fascinate us. Some of the more realistic and powerful models are computer simulations. Simple, elegant, and powerful tools are available in user-friendly free software to help you design, build, and run your own models of the social interactions that intrigue you, and to do this on the most basic laptop computer. Focusing on a well-known model of housing segregation, this Element is about how to unleash that power, setting out the fundamentals of what is now known as 'agent based modeling'.
In this Element and its accompanying second Element, A Practical Introduction to Regression Discontinuity Designs: Extensions, Matias Cattaneo, Nicolás Idrobo, and Rocío Titiunik provide an accessible and practical guide for the analysis and interpretation of regression discontinuity (RD) designs that encourages the use of a common set of practices and facilitates the accumulation of RD-based empirical evidence. In this Element, the authors discuss the foundations of the canonical Sharp RD design, which has the following features: (i) the score is continuously distributed and has only one dimension, (ii) there is only one cutoff, and (iii) compliance with the treatment assignment is perfect. In the second Element, the authors discuss practical and conceptual extensions to this basic RD setup.
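For a sense of the accompanying software, the sketch below estimates a sharp RD effect with the rdrobust R package, maintained by the same authors; the score, outcome, and cutoff here are simulated for illustration, not drawn from the Element's empirical application.

    # Sharp RD estimation with rdrobust (simulated score and outcome)
    library(rdrobust)

    set.seed(1)
    x <- runif(1000, -1, 1)                                  # running variable (score), cutoff at 0
    y <- 0.5 * x + 0.3 * (x >= 0) + rnorm(1000, sd = 0.2)    # treatment effect of 0.3 at the cutoff

    out <- rdrobust(y = y, x = x, c = 0)   # local polynomial RD estimate
    summary(out)

    rdplot(y = y, x = x, c = 0)            # standard RD plot with binned means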
The rise of the internet and mobile telecommunications has created the possibility of using large datasets to understand behavior at unprecedented levels of temporal and geographic resolution. Online social networks attract the most users, though users of these new technologies provide their data through multiple sources, e.g. call detail records, blog posts, web forums, and content aggregation sites. These data allow scholars to adjudicate between competing theories as well as develop new ones, much as the microscope facilitated the development of the germ theory of disease. Of those networks, Twitter presents an ideal combination of size, international reach, and data accessibility that makes it the preferred platform in academic studies. Acquiring, cleaning, and analyzing these data, however, require new tools and processes. This Element introduces these methods to social scientists and provides scripts and examples for downloading, processing, and analyzing Twitter data.
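As a hedged sketch of what downloading and processing might look like in R, one commonly used entry point has been the rtweet package; the query below is hypothetical, valid API credentials are required, and Twitter's access rules have changed over time, so the Element's own scripts may differ.

    # Downloading and summarizing tweets with rtweet (illustrative; requires API access)
    library(rtweet)

    # Search recent tweets matching a hypothetical query, excluding retweets
    tw <- search_tweets("climate policy", n = 500, include_rts = FALSE)

    # Simple processing: number of tweets per day
    table(as.Date(tw$created_at))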