Many professions (e.g., teachers, pilots, air traffic controllers, physicians) require applicants to pass a licensing exam, the principal purpose of which is to protect the public from incompetent practitioners. These exams sometimes show the same sorts of race and sex differences observed in other test scores, so they too are susceptible to equity criticisms. We discuss the implications of eliminating such tests or even just lowering cutoff scores. Medical licensing has been around for over 1,000 years, yet the U.S. did not start licensing physicians until the late 1800s. Early exams were oral, subject to criticisms about objectivity, and resulted in disaster in West Virginia. Ultimately, the National Board of Medical Examiners was formed and multiple-choice exams replaced essay exams on the United States Medical Licensing Examination (USMLE). To get into medical school, undergraduates must take the Medical College Admission Test (MCAT). Like other tests, the MCAT reveals race and sex differences. The same is true for tests to license pilots and air traffic controllers. K-12 teacher licensing formally began with the National Teacher Examination (NTE) in 1940.
How far have we come? What strategies will most likely aid in achieving our goals? What evidence must be gathered to go further? We have focused in the book on how tests provide valuable information when making decisions about who to admit, who to hire, who to license, who to award scholarships to, and so on. Given limited resources, efficiency in selection is essential. However, tests used for these purposes also reveal race and sex differences that conflict with society’s desire for fairness. How do we make policies and decisions so as to maximize efficiency while also minimizing adverse impact? There is no statistical solution to this problem. We suggest an approach that will get us closer to an acceptable solution than where we currently stand. A first step is to gather all relevant data so that any selection policy can be evaluated with respect to both kinds of errors (false positives and false negatives). Second, make such data publicly available so that all interested parties have access and everything is transparent. As mentioned previously, such data are often not made available out of fear of criticism. Third, causal connections between policies and outcomes should be established. Finally, if considerations other than merit are important, those arguments should be made public and modifications examined to measure the impact of policy adjustments.
We trace the origins of testing to its civil service roots in Xia Dynasty China 4,000 years ago, to the Middle East in Biblical times, and to the monumental changes in psychometrics in the latter half of the twentieth century. The early twentieth century witnessed the birth of the multiple-choice test and a focus on measuring cognitive ability rather than knowledge of content – influenced greatly by IQ and US Army placement testing. Multiple-choice tests provided an objectivity in scoring that had previously eluded the standard essays used in college entrance exams. The field of testing began to take notice of measurement errors and strove to minimize them. Computerized Adaptive Tests (CATs) were developed to measure a person’s ability accurately with the fewest possible items, as sketched below. The future advancement of testing depends on a continued process of experimentation to determine what improves measurement and what does not.
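To make the adaptive idea concrete, here is a minimal illustrative sketch of the core CAT loop under a two-parameter logistic (2PL) IRT model: after each response the ability estimate is updated, and the next item administered is the one with maximum Fisher information at the current estimate. The item bank, the grid-based ability update, and the simulated examinee below are our own assumptions, not material from the chapter.

```python
import numpy as np

# Illustrative 2PL item bank: each item has a discrimination (a) and difficulty (b).
items = [{"a": a, "b": b} for a, b in [(1.2, -1.0), (0.8, 0.0), (1.5, 0.5), (1.0, 1.2)]]

def p_correct(theta, item):
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + np.exp(-item["a"] * (theta - item["b"])))

def fisher_information(theta, item):
    """Item information at ability theta: a^2 * P * (1 - P)."""
    p = p_correct(theta, item)
    return item["a"] ** 2 * p * (1.0 - p)

def update_theta(responses, grid=np.linspace(-4, 4, 161)):
    """Crude maximum-likelihood ability estimate over a grid of theta values."""
    loglik = np.zeros_like(grid)
    for item, correct in responses:
        p = p_correct(grid, item)
        loglik += np.log(p if correct else 1.0 - p)
    return grid[np.argmax(loglik)]

# Adaptive loop: administer the most informative unused item at the current estimate.
rng = np.random.default_rng(1)
theta_hat, administered, responses = 0.0, set(), []
for _ in range(3):
    candidates = [i for i in range(len(items)) if i not in administered]
    nxt = max(candidates, key=lambda i: fisher_information(theta_hat, items[i]))
    administered.add(nxt)
    correct = rng.random() < p_correct(0.3, items[nxt])  # simulated examinee, true theta = 0.3
    responses.append((items[nxt], correct))
    theta_hat = update_theta(responses)
print(f"Estimated ability after {len(responses)} items: {theta_hat:.2f}")
```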
Armed services tests have existed for centuries. We focus on the US Armed Services and how the tests used have adapted as the claims made for them, and the needs and purposes they serve, have changed. World War I provided the impetus for the first serious military testing program. An all-star group of psychologists convened in Vineland, New Jersey, and quickly constructed Army Alpha, which became a model for later group-administered, objective, multiple-choice tests. Military testing was the first program to explicitly move from very specialized tests for specific purposes to testing a generalized underlying ability. This made such tests suitable for situations not even considered initially. The practice was both widely followed and just as widely disparaged. The AGCT, AFQT, and ASVAB were later versions of this initial test. Army Alpha also influenced the creation of the SAT, ACT, GRE, LSAT, and MCAT. Decisions based on military tests, like all tests, can be controversial. In 1965, Project 100,000 lowered the cut score, resulting in thousands of low-scoring men being drafted, many of whom later died fighting in Vietnam.
In both K-12 and higher education, it is common to use test scores when deciding which students receive scholarships and other awards. As with placement decisions, this practice is controversial due to issues of equity. We discuss the evidence supporting test scores as an aid in making such decisions, including the costs of finding suitable winners, the costs of false positives, and the costs of false negatives. The G.I. Bill and the United Negro College Fund provided scholarships for soldiers returning from World War II. Athletic scholarships have been around since 1952 and show an incredible disparity by race, favoring Black students. In 1955, the National Merit Scholarship (NMS) program was created to support students. The award is not much financially (~$2,500), but other sources of support usually follow students who score high enough to qualify. Like other tests, the PSAT/NMSQT shows race differences. States have addressed this differently, with some ranking students by district or school rather than by state, resulting in more minorities receiving awards. Evidence suggests that ranking within schools rather than statewide results in students with lower scores receiving awards but not performing as well academically as higher-scoring students who do not receive awards. Issues of fairness in testing remain.
National digital ID apps are increasingly gaining popularity globally. As the way we transact in the world is increasingly mediated by digital technologies, questions need to be asked about how these apps support the inclusion of disabled people. In particular, international instruments, such as the United Nations Convention on the Rights of Persons with Disabilities, spotlight the need for inclusive information and communication technologies. In this paper, we adopt a critical disability studies lens to analyse the workings of a state-designed digital ID, Singapore's Singpass app, and what it can tell us about existing ways of designing for digital inclusion. We situate the case of the Singpass app within the rise of global digital transactions and the political-technical infrastructures that shape their accessibility. We analyse the ways Singpass centres disability, the problems it may still entail, and the possible implications for inclusion. At the same time, we uncover the lessons Singpass’s development holds for questions of global digital inclusion.
As digital welfare systems expand in local governments worldwide, understanding their implications is crucial for safeguarding public values like transparency, legitimacy, accountability, and privacy. A lack of political debate on data-driven technologies risks eroding democratic legitimacy by obscuring decision-making and impeding accountability mechanisms. In the Netherlands, political discussions on digital welfare within local governments are surprisingly limited, despite evidence of negative impacts on both frontline professionals and citizens. This study examines which mechanisms explain if and how data-driven technologies in the domain of work and income are politically discussed within the municipal government of a large city in the Netherlands, and with what consequences. Using a sequential mixed methods design, combining the automated text-analysis software ConText (1.2.0) and the text-analysis software Atlas.ti (9), we analyzed documents and video recordings of municipal council and committee meetings from 2016 to 2023. Our results show that these discussions are rare in the municipal council, occurring primarily in reaction to scandals or criticism. Two key discursive factors used to justify the limited political discussion are: (1) claims of lacking time and knowledge among council members and aldermen, and (2) distancing responsibility and diffusing accountability. This leads to a ‘content chopping’ mechanism, in which issues are chopped into small pieces of content (for example, technical, ethical, and political aspects) that are spread across separate documents and discussion arenas. This fragmentation can obscure overall coherence and diffuse critical concerns, potentially leading to harmful effects like dehumanization and stereotyping.
The Pósa–Seymour conjecture determines the minimum degree threshold for forcing the $k$th power of a Hamilton cycle in a graph. After numerous partial results, Komlós, Sárközy, and Szemerédi proved the conjecture for sufficiently large graphs. In this paper, we focus on the analogous problem for digraphs and for oriented graphs. We asymptotically determine the minimum total degree threshold for forcing the square of a Hamilton cycle in a digraph. We also give a conjecture on the corresponding threshold for $k$th powers of a Hamilton cycle more generally. For oriented graphs, we provide a minimum semi-degree condition that forces the $k$th power of a Hamilton cycle; although this minimum semi-degree condition is not tight, it does provide the correct order of magnitude of the threshold. Turán-type problems for oriented graphs are also discussed.
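For readers who want the precise statement behind this summary, the degree condition in the Pósa–Seymour conjecture (proved by Komlós, Sárközy, and Szemerédi for sufficiently large $n$) can be written as follows; the shorthand $C_n^k$ for the $k$th power of a Hamilton cycle on $n$ vertices is our notation, not the paper's.

```latex
% Pósa–Seymour conjecture (proved for sufficiently large n):
% every graph G on n vertices with minimum degree at least kn/(k+1)
% contains the k-th power of a Hamilton cycle, here denoted C_n^k.
\[
  \delta(G) \;\ge\; \frac{k}{k+1}\, n
  \quad\Longrightarrow\quad
  C_n^{k} \subseteq G .
\]
```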
For $\ell \geq 3$, an $\ell$-uniform hypergraph is disperse if the number of edges induced by any set of $\ell +1$ vertices is 0, 1, $\ell$, or $\ell +1$. We show that every disperse $\ell$-uniform hypergraph on $n$ vertices contains a clique or independent set of size $n^{\Omega _{\ell }(1)}$, answering a question of the first author and Tomon. To this end, we prove several structural properties of disperse hypergraphs.
We investigate the properties of the linear two-way fixed effects (FE) estimator for panel data when the underlying data generating process (DGP) does not have a linear parametric structure. The FE estimator is consistent for some pseudo-true value and we characterize the corresponding asymptotic distribution. We show that the rate of convergence is determined by the degree of model misspecification, and that the asymptotic distribution can be non-normal. We propose a novel autoregressive double adaptive wild (AdaWild) bootstrap procedure applicable for a large class of DGPs. Monte Carlo simulations show that it performs well for panels of small and moderate dimensions. We use data from U.S. manufacturing industries to illustrate the benefits of our procedure.
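For orientation, the estimator under discussion is the standard two-way FE within-estimator. The sketch below, our own illustration on a simulated balanced panel with made-up variable names (and not the paper's AdaWild bootstrap), shows the double-demeaning that defines it.

```python
import numpy as np
import pandas as pd

def two_way_fe(df, y, x, unit, time):
    """Two-way fixed effects (within) estimator for a balanced panel.

    Demeans y and x by unit and by time period (adding back the grand mean),
    then runs pooled OLS on the transformed data.
    """
    d = df.copy()
    for col in [y, x]:
        d[col + "_dm"] = (
            d[col]
            - d.groupby(unit)[col].transform("mean")
            - d.groupby(time)[col].transform("mean")
            + d[col].mean()
        )
    X = d[x + "_dm"].to_numpy()
    Y = d[y + "_dm"].to_numpy()
    return (X @ Y) / (X @ X)  # single-regressor OLS slope

# Hypothetical usage on a simulated balanced panel with unit and time effects.
rng = np.random.default_rng(0)
n_units, n_periods = 50, 10
panel = pd.DataFrame(
    [(i, t) for i in range(n_units) for t in range(n_periods)],
    columns=["unit", "year"],
)
alpha = rng.normal(size=n_units)[panel["unit"]]
gamma = rng.normal(size=n_periods)[panel["year"]]
panel["x"] = rng.normal(size=len(panel)) + alpha
panel["y"] = 1.5 * panel["x"] + alpha + gamma + rng.normal(size=len(panel))
print(two_way_fe(panel, "y", "x", "unit", "year"))  # close to the true slope 1.5
```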
We describe the asymptotic behaviour of large degrees in random hyperbolic graphs for all values of the curvature parameter $\alpha$. We prove that, with high probability, the node degrees satisfy the following ordering property: the ranking of the nodes by decreasing degree coincides with the ranking of the nodes by increasing distance to the centre, at least up to any constant rank. In the sparse regime $\alpha>\tfrac{1}{2}$, the rank at which these two rankings cease to coincide is $n^{1/(1+8\alpha)+o(1)}$. We also provide a quantitative description of the large degrees by proving the convergence in distribution of the normalised degree process towards a Poisson point process. In particular, this establishes the convergence in distribution of the normalised maximum degree of the graph. A transition occurs at $\alpha = \tfrac{1}{2}$, which corresponds to the connectivity threshold of the model. For $\alpha < \tfrac{1}{2}$, the maximum degree is of order $n - O(n^{\alpha + 1/2})$, whereas for $\alpha \geq \tfrac{1}{2}$, the maximum degree is of order $n^{1/(2\alpha)}$. In the $\alpha < \tfrac{1}{2}$ and $\alpha > \tfrac{1}{2}$ cases, the limit distribution of the maximum degree belongs to the class of extreme value distributions (Weibull for $\alpha < \tfrac{1}{2}$ and Fréchet for $\alpha > \tfrac{1}{2}$). This refines previous estimates on the maximum degree for $\alpha > \tfrac{1}{2}$ and extends the study of large degrees to the dense regime $\alpha \leq \tfrac{1}{2}$.
Trypanosoma cruzi, the etiological agent of Chagas disease, is a vector-borne parasite traditionally associated with sylvatic environments. We investigated the prevalence of T. cruzi in triatomines collected from El Paso County, Texas, and southern New Mexico. Specimens were morphologically identified as Triatoma rubida and subjected to quantitative PCR for parasite detection. Molecular sequencing of satellite and microsatellite DNA targets was performed to confirm species identity and assess strain lineage. Infected vectors were collected from both sylvatic and urban locations, including Franklin Mountains State Park and residential areas in El Paso (TX) and Las Cruces (NM). Of the 26 triatomines tested, 88.5% were positive for T. cruzi, representing a significant increase compared to a previous regional study, which reported an infection rate of 63.3%. The high prevalence of T. cruzi-infected T. rubida, particularly in urban and peri-urban areas of El Paso and Las Cruces, underscores the increasing public health significance of Chagas disease along the U.S.–Mexico border. These findings highlight the urgent need for sustained vector surveillance, advanced molecular characterization, and focused public health interventions to reduce transmission risks and raise clinical awareness in affected regions.
This study investigates unintended information flow in large language models (LLMs) by proposing a computational linguistic framework for detecting and analyzing domain anchorage. Domain anchorage is a phenomenon potentially caused by in-context learning or latent “cache” retention of prior inputs, which enables language models to infer and reinforce shared latent concepts across interactions, leading to uniformity in responses that can persist across distinct users or prompts. Using GPT-4 as a case study, our framework systematically quantifies the lexical, syntactic, semantic, and positional similarities between inputs and outputs to detect these domain anchorage effects. We introduce a structured methodology to evaluate the associated risks and highlight the need for robust mitigation strategies. By leveraging domain-aware analysis, this work provides a scalable framework for monitoring information persistence in LLMs, which can inform enterprise guardrails to ensure response consistency, privacy, and safety in real-world deployments.
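To illustrate the kind of input–output similarity measurement such a framework relies on, here is a minimal sketch of lexical and bag-of-words similarity scores over responses given to different users; the metrics, thresholds, and example texts are our own placeholders, not the authors' implementation.

```python
import math
import re
from collections import Counter

def tokens(text: str) -> list[str]:
    """Lowercase word tokens; a deliberately simple lexical normalisation."""
    return re.findall(r"[a-z0-9']+", text.lower())

def jaccard(a: str, b: str) -> float:
    """Lexical overlap between two texts (shared vocabulary / total vocabulary)."""
    sa, sb = set(tokens(a)), set(tokens(b))
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def cosine_bow(a: str, b: str) -> float:
    """Cosine similarity of bag-of-words count vectors, a crude semantic proxy."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0

# Flag possible "domain anchorage": responses to different users that are
# unusually similar to one another (the thresholds are arbitrary placeholders).
responses = {"user_a": "The quarterly forecast shows rising demand.",
             "user_b": "The quarterly forecast shows rising demand in Q3."}
for u, v in [("user_a", "user_b")]:
    lex, sem = jaccard(responses[u], responses[v]), cosine_bow(responses[u], responses[v])
    if lex > 0.6 or sem > 0.8:
        print(f"{u} vs {v}: lexical={lex:.2f}, cosine={sem:.2f} -> review for anchorage")
```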
We contribute to the recent debate on the instability of the slope of the Phillips curve by offering insights from a flexible time-varying instrumental variable (IV) approach robust to weak instruments. Our robust approach focuses directly on the Phillips curve and allows general forms of instability, in contrast to current approaches based either on structural models with time-varying parameters or IV estimates in ad-hoc sub-samples. We find evidence of a weakening of the slope of the Phillips curve starting around 1980. We also offer novel insights on the Phillips curve during the recent pandemic: The flattening has reverted and the Phillips curve is back.
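As a point of reference, the object of interest here is the slope coefficient in an expectations-augmented Phillips curve. The stylised specification below, with a time-varying slope estimated by IV, uses our own illustrative notation and is not necessarily the authors' exact model.

```latex
% Stylised expectations-augmented Phillips curve with a time-varying slope kappa_t,
% estimated by IV because the activity measure x_t is endogenous;
% z_t denotes the instrument (illustrative notation only).
\[
  \pi_t \;=\; \mu_t \;+\; \kappa_t\, x_t \;+\; \gamma\, \mathbb{E}_t[\pi_{t+1}] \;+\; \varepsilon_t ,
  \qquad \mathbb{E}[\varepsilon_t \mid z_t] = 0 .
\]
```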
Social relationships provide opportunities to exchange and obtain health advice. Not only close confidants may be perceived as sources of health advice, but also acquaintances met in places outside a closed circle of family and friends, e.g., in voluntary organizations. This study is the first to analyze the structure of complete health advice networks in three voluntary organizations and compare them with the more commonly studied close relationships. To this end, we collected data on multiple networks and health outcomes among 143 middle-aged and older adults (mean age = 53.9 years) in three carnival clubs in Germany. Our analyses demonstrate that perceived health advice and close relationships overlap by only 34%. Moreover, recent advances in exponential random graph models (ERGMs) allow us to illustrate that the network structure of perceived health advice differs starkly from that of close relationships. For instance, we found that advice networks exhibited lower transitivity and greater segregation by gender and age than networks of close relationships. We also found that actors with poor physical health perceive fewer individuals as health advisors than those with good physical health. Our findings suggest that community settings, such as voluntary associations, provide a unique platform for exchanging health advice and information among both close and distant network members.
Choose the type of multivariable model based on the type of outcome variable you have. Perform univariate statistics to understand the distribution of your independent and outcome variables. Perform bivariate analyses of your independent variables. Run a correlation matrix to understand how your independent variables are related to one another. Assess your missing data. Perform your analysis and assess how well your model fits the data. Assess the strength of your individual covariates in estimating the outcome. Use regression diagnostics to assess the underlying assumptions of your model. Perform sensitivity analyses to assess the robustness of your findings and consider whether it would be possible to validate your model. Publish your work and soak up the glory.
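As an illustration of how these steps might look in practice for a binary outcome, here is a minimal sketch in Python on a simulated dataset; the variable names, the choice of a logistic model, and the diagnostics shown are our own assumptions, not prescriptions from the chapter.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Hypothetical cohort with a binary outcome and three candidate predictors.
rng = np.random.default_rng(42)
n = 500
df = pd.DataFrame({
    "age": rng.normal(55, 10, n),
    "female": rng.integers(0, 2, n),
    "smoker": rng.integers(0, 2, n),
})
logit_p = -6 + 0.08 * df["age"] + 0.9 * df["smoker"]
df["event"] = (rng.random(n) < 1 / (1 + np.exp(-logit_p))).astype(int)

# 1. Univariate statistics: distributions of predictors and outcome.
print(df.describe())

# 2. Correlation matrix: how the independent variables relate to one another.
print(df[["age", "female", "smoker"]].corr())

# 3. Missing data check.
print(df.isna().sum())

# 4. Fit the multivariable model (logistic regression for a binary outcome)
#    and assess model fit and the strength of individual covariates.
X = sm.add_constant(df[["age", "female", "smoker"]])
model = sm.Logit(df["event"], X).fit()
print(model.summary())  # coefficients, p-values, pseudo R-squared

# 5. A simple diagnostic: compare predicted probabilities with observed event rates.
df["p_hat"] = model.predict(X)
print(df.groupby(pd.qcut(df["p_hat"], 5, duplicates="drop"), observed=True)["event"].mean())
```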
Performing the analysis: a series of well-defined steps to ensure you are entering the correct information into the model. A useful convention for coding dichotomous variables is to assign “1” to the presence and “0” to the absence of the condition; the variable’s mean will then equal the condition’s prevalence. The choice of reference group will not affect the results, but it will affect how the results are reported; choose your reference category based on the main hypothesis. Interaction terms are entered by creating a product term: a variable whose value is the product of two independent variables. For proportional hazards or other survival-time models, enter the starting time, the date of the outcome of interest, and the censoring date.
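A small sketch of these coding conventions, using hypothetical column names and pandas as our tool of choice:

```python
import pandas as pd

df = pd.DataFrame({
    "diabetes": ["yes", "no", "no", "yes", "no"],
    "age": [62, 48, 55, 70, 51],
    "female": [1, 0, 1, 1, 0],
    "enroll_date": pd.to_datetime(["2015-01-10", "2015-02-03", "2015-02-20", "2015-03-05", "2015-03-22"]),
    "event_date": pd.to_datetime(["2018-06-01", None, None, "2017-09-15", None]),
    "censor_date": pd.to_datetime(["2018-06-01", "2020-12-31", "2020-12-31", "2017-09-15", "2020-12-31"]),
})

# Dichotomous variable coded 1 = presence, 0 = absence;
# its mean is then the prevalence of the condition.
df["diabetes01"] = (df["diabetes"] == "yes").astype(int)
print("Prevalence of diabetes:", df["diabetes01"].mean())

# Interaction term: the product of two independent variables.
df["age_x_female"] = df["age"] * df["female"]

# Survival-time setup: follow-up runs from the start date to the
# event date if the event occurred, otherwise to the censoring date.
df["event"] = df["event_date"].notna().astype(int)
end = df["event_date"].fillna(df["censor_date"])
df["followup_days"] = (end - df["enroll_date"]).dt.days
print(df[["diabetes01", "age_x_female", "event", "followup_days"]])
```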
Variable selection techniques: automatic procedures determining which independent variables will be included in a model. It is usually better for the investigator to decide what variables should be in the model rather than using a statistical algorithm.
Working with a biostatistician should be an iterative process. Especially with complicated studies, consult the biostatistician at each phase of the analysis. For conducting the analysis, use the statistical package your research group uses so you will always be able to get help when needed.