Efforts to curb online hate speech depend on our ability to reliably detect it at scale. Previous studies have highlighted the strong zero-shot classification performance of large language models (LLMs), offering a potential tool to efficiently identify harmful content. Yet for complex and ambivalent tasks like hate speech detection, pre-trained LLMs can be insufficient and carry systemic biases. Domain-specific models fine-tuned for the given task and empirical context could help address these issues, but, as we demonstrate, the quality of data used for fine-tuning decisively matters. In this study, we fine-tuned GPT-4o-mini using a unique corpus of online comments annotated by diverse groups of coders with varying annotation quality: research assistants, activists, two kinds of crowd workers, and citizen scientists. We find that only annotations from those groups of annotators that are better than zero-shot GPT-4o-mini in recognizing hate speech improve the classification performance of the fine-tuned LLM. Specifically, fine-tuning using the highest-quality annotator group – trained research assistants – boosts classification performance by increasing the model’s precision without notably sacrificing the good recall of zero-shot GPT-4o-mini. In contrast, lower-quality annotations do not improve and may even decrease the ability to identify hate speech. By examining tasks reliant on human judgment and context, we offer insights that go beyond hate speech detection.
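As a rough illustration of the fine-tuning setup the abstract describes, the sketch below prepares annotator-labeled comments in the chat-style JSONL format used by OpenAI's fine-tuning API. The comments, labels, and system prompt are invented placeholders, not the study's corpus or prompt.

```python
# Hedged sketch: converting human hate-speech annotations into chat-format
# JSONL records for fine-tuning. Example comments and the system prompt are
# hypothetical stand-ins, not the study's actual data.
import json

SYSTEM = "Classify the comment as 'hate speech' or 'not hate speech'."

# Stand-in for comments labeled by (e.g.) trained research assistants.
annotations = [
    ("You people should all disappear from this country.", "hate speech"),
    ("I disagree with the new policy, it seems poorly planned.", "not hate speech"),
]

lines = []
for comment, label in annotations:
    record = {"messages": [
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": comment},
        {"role": "assistant", "content": label},
    ]}
    lines.append(json.dumps(record))

# One JSON record per line; this string would be written to a .jsonl file
# and uploaded as the fine-tuning training file.
jsonl = "\n".join(lines)
print(len(lines))
```

The study's finding that annotation quality decisively matters shows up at exactly this step: whatever labels go into the `assistant` turns is what the fine-tuned model learns to reproduce.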
An ever-increasing availability of digital texts has opened new research opportunities for political scientists. Yet, researchers who want to utilise these data face several challenges. This paper presents the results of a community-wide survey tapping into various research challenges, training needs, and preferences of scholars using text analysis methodologies. The survey involved respondents from various academic fields and career levels. Our findings indicate that text-as-data methods are gaining momentum in various political science subfields and are used on a wide range of political texts. However, relevant training is not easily accessible to all. Only half of the respondents have ever participated in a training event, though there is a high demand for training opportunities in different formats and at different levels. In ‘Conclusions’, we discuss how the inaccessibility of training risks narrowing the field of researchers.
A close connection between public opinion and policy is considered a vital element of democracy. However, legislators cannot be responsive to all voters at all times with regard to the policies the latter favour. We argue that legislators use their speaking time in parliament to offer compensatory speech to their constituents who might oppose how they voted on a policy, in order to re‐establish themselves as responsive to the public's wishes. Leveraging the case of Brexit, we show that legislators pay more attention to constituents who might be dissatisfied with how they voted. Furthermore, their use of rhetorical responsiveness is contingent on the magnitude of the representational deficit they face vis‐à‐vis their constituency. Our findings attest to the central role of parliamentary speech in maintaining responsiveness. They also demonstrate that communicative responsiveness can substitute for policy responsiveness.
Debates about the European Union's democratic legitimacy put national parliaments into the spotlight. Do they enhance democratic accountability by offering visible debates and electoral choice about multilevel governance? To support such accountability, the salience of EU affairs in the plenary ought to be responsive to developments in EU governance, be linked to decision‐making moments, and feature a balance between government and opposition. The recent literature discusses various partisan incentives that support or undermine these criteria, but analyses integrating these arguments are rare. This article provides a novel comparative perspective by studying the patterns of public EU emphasis in more than 2.5 million plenary speeches from the German Bundestag, the British House of Commons, the Dutch Tweede Kamer and the Spanish Congreso de los Diputados over a prolonged period from 1991 to 2015. It documents that parliamentary actors are by and large responsive to EU authority and its exercise, with intergovernmental moments of decision making in particular sparking plenary EU salience. But the salience of EU issues is mainly driven by government parties, decreases at election time and is negatively related to public Euroscepticism. The article concludes that national parliaments have only partially succeeded in enhancing EU accountability and suffer from an opposition deficit in particular.
The promises and pitfalls of automated (computer-assisted) and human-coding content analysis techniques applied to political science research have been extensively discussed in the scholarship on party politics and legislative studies. This study presents a similar comparative analysis outlining the pay-offs and trade-offs of these two methods of content analysis applied to research on EU lobbying. The empirical focus is on estimating interest groups’ positions based on their formally submitted policy position documents in the context of EU policymaking. We identify the defining characteristics of these documents and argue that the choice for a method of content analysis should be informed by a concern for addressing the specificities of the research topic covered, of the research question asked and of the data sources employed. We discuss the key analytical assumptions and methodological requirements of automated and human-coding text analysis and the degree to which they match the identified text characteristics. We critically assess the most relevant methodological challenges research designs face when these requirements need to be complied with and how these challenges might affect measurement validity. We also compare the two approaches in terms of their reliability and resource intensity. The article concludes with recommendations and issues for future research.
From the early use of TF-IDF to the high-dimensional outputs of deep learning, vector space embeddings of text, at a scale ranging from token to document, are at the heart of all machine analysis and generation of text. In this article, we present the first large-scale comparison of a sampling of such techniques on a range of classification tasks on a large corpus of current literature drawn from the well-known Books3 data set. Specifically, we compare TF-IDF, Doc2vec and several Transformer-based embeddings on a variety of text-specific tasks. Using industry-standard BISAC codes as a proxy for genre, we compare embeddings in their ability to preserve information about genre. We further compare these embeddings in their ability to encode inter- and intra-book similarity. All of these comparisons take place at the book “chunk” (1,024 tokens) level. We find Transformer-based (“neural”) embeddings to be best, in the sense of their ability to respect genre and authorship, although almost all embedding techniques produce sensible constructions of a “literary landscape” as embodied by the Books3 corpus. These experiments suggest the possibility of using deep learning embeddings not only for advances in generative AI, but also as a potential tool for book discovery and an aid to various forms of more traditional comparative textual analysis.
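The classical baseline in this comparison can be sketched in a few lines: a minimal TF-IDF embedding of text "chunks" plus cosine similarity, the kind of representation the study benchmarks against Doc2vec and Transformer embeddings. The toy chunks below are hypothetical stand-ins, not Books3 excerpts.

```python
# Minimal TF-IDF vectors plus cosine similarity (illustrative sketch,
# not the paper's pipeline). Same-genre chunks should score closer
# than cross-genre ones.
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: tf * idf} vector per whitespace-tokenized document."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    df = Counter(t for doc in tokenized for t in set(doc))
    vectors = []
    for doc in tokenized:
        tf = Counter(doc)
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vectors

def cosine(u, v):
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Toy stand-ins for 1,024-token book chunks (hypothetical data).
chunks = [
    "the detective searched the foggy harbour for clues",
    "the inspector pressed the suspect about the missing clues",
    "quarterly revenue grew as the firm expanded its market share",
]
vecs = tfidf_vectors(chunks)
print(cosine(vecs[0], vecs[1]) > cosine(vecs[0], vecs[2]))  # prints True
```

A Transformer embedding would replace `tfidf_vectors` with dense model outputs, but the downstream similarity comparison works the same way.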
Supervised learning is increasingly used in social science research to quantify abstract concepts in textual data. However, a review of recent studies reveals inconsistencies in reporting practices and validation standards. To address this issue, we propose a framework that systematically outlines the process of transforming text into a quantitative measure, emphasizing key reporting decisions at each stage. Clear and comprehensive validation is crucial, enabling readers to critically evaluate both the methodology and the resulting measure. To illustrate our framework, we develop and validate a measure assessing the tone of questions posed to nominees during U.S. Senate confirmation hearings. This study contributes to the growing literature advocating for transparency and rigor in applying machine learning methods within computational social sciences.
Servitization is a key strategy for enhancing competitiveness in manufacturing, yet the managerial drivers behind this transformation remain underexplored. This study investigates the impact of top executives’ service cognition on servitization using a novel index derived from text-mined disclosures of Chinese listed manufacturing firms (2007–2020). Results show that executives’ service cognition significantly promotes servitization, even after controlling for endogeneity using instrumental variables and Heckman’s two-stage model. Mechanism analysis reveals that this cognitive orientation enhances human capital accumulation and R&D investment, which in turn drive higher service levels. Furthermore, the relationship is moderated by executive power concentration and regional internet penetration. Heterogeneity tests indicate stronger effects in high-tech industries, state-owned enterprises, and large firms. These findings highlight the critical role of executive cognition in shaping strategic transformation and offer practical implications for firms and policymakers aiming to foster servitization through leadership development and supportive digital infrastructure.
The sentiment expressed in a legislator’s speech is informative. However, extracting legislators’ sentiment requires human-annotated data. Instead, we propose exploiting closing debates on a bill in Japan, where legislators in effect label their speech as either pro or con. We utilize debate speeches as the training dataset, fine-tune a pretrained model, and calculate the sentiment scores of other speeches. We show that the more senior the opposition members are, the more negative their sentiment. Additionally, we show that opposition members become more negative as the next election approaches. We also demonstrate that legislators’ sentiments can be used to predict their behaviors by using the case in which government members rebelled in the historic vote of no confidence in 1993.
Chapter 7 builds on students’ understanding of arrays and numeric and logical data types from Chapters 2 and 4, demonstrating how to use what they already know to manipulate text in MATLAB. Text in MATLAB comes in two forms: character arrays, in which text is stored in individual letters, numbers, symbols, and spaces; and strings, in which each element of text can store any number of those characters. Differences in the utility of these structures for different tasks are discussed, as is their interchangeability when providing inputs to other MATLAB functions. Once text is introduced, students learn to interface with MATLAB via input/output features, both in the console and in pop-up windows. Lastly, because MATLAB code is also text, students learn to run text as MATLAB code, as well as potential issues with doing so and workarounds to avoid those issues.
Propagandists discredit political ideas that rival their own. In China’s state-run media, one common technique is to place the phrase so-called, in English, or 所谓, in Chinese, before the idea to be discredited. In this research note we apply quantitative text analysis methods to over 45,000 Xinhua articles from 2003 to 2022 containing so-called or 所谓 to better understand the ideas the government wishes to discredit for different audiences. We find that perceived challenges to China’s sovereignty consistently draw usage of the term and that a theme of rising importance is political rivalry with the United States. When it comes to differences between internal and external propaganda, we find broad similarities, but differences in how the US is discredited and more emphasis on cooperation for foreign audiences. These findings inform scholarship on comparative authoritarian propaganda and Chinese propaganda specifically.
A critical challenge for biomedical investigators is the delay between research and its adoption, yet there are few tools that use bibliometrics and artificial intelligence to address this translational gap. We built a tool to quantify translation of clinical investigation using novel approaches to identify themes in published clinical trials from PubMed and their appearance in the natural language elements of the electronic health record (EHR).
Methods:
As a use case, we selected the translation of known health effects of exercise for heart disease, as found in published clinical trials, with the appearance of these themes in the EHR of heart disease patients seen in an emergency department (ED). We present a self-supervised framework that quantifies semantic similarity of themes within the EHR.
Results:
We found that 12.7% of the clinical trial abstracts in the dataset recommended aerobic exercise or strength training. Of the ED treatment plans, 19.2% related to heart disease. Of these heart disease-related treatment plans, only 0.34% mentioned aerobic exercise or strength training. Treatment plans from the overall ED dataset mentioned aerobic exercise or strength training less than 5% of the time.
Conclusions:
Having access to publicly available clinical research and associated EHR data, including clinician notes and after-visit summaries, provided a unique opportunity to assess the adoption of clinical research in medical practice. This approach can be used for a variety of clinical conditions, and if assessed over time could measure implementation effectiveness of quality improvement strategies and clinical guidelines.
This paper studies the role of central bank communication for the monetary policy transmission mechanism using text analysis techniques. In doing so, we derive sentiment measures from European Central Bank (ECB)’s press conferences indicating a dovish or hawkish tone referring to interest rates, inflation, and unemployment. We provide strong evidence that our sentiment measures predict interbank interest rates, even after controlling for actual policy rate changes. We also find that our sentiment indicators offer predictive power for professionals’ expectations, the disagreement among them, and their uncertainty regarding future inflation as well as future interest rates. Policy communication shocks identified through sign restrictions based on our sentiment measure also have significant effects on real outcomes. Overall, our findings highlight the importance of the tone of central bank communication for the transmission mechanism of monetary policy, but also indicate the necessity of refinements of the communication policies implemented by the ECB to better anchor inflation expectations at the target level and to reduce uncertainty regarding the future path of monetary policy.
It is often argued that when legislators have personal vote-seeking incentives, parties are less unified because legislators need to build bonds of accountability with their voters. I argue that these effects depend on a legislator’s ability to cultivate a personal vote. When parties control access to the ballot and the resources candidates need to cultivate personal votes, they can condition a legislator’s access to these resources on loyalty to the party’s agenda. I test this theory by conducting a difference-in-differences analysis that leverages the staggered implementation of the 2014 Mexican Electoral Reform. This reform introduced the possibility of consecutive reelection for state legislators, increasing their incentives to cultivate personal votes. I study unity in position-taking and voting behaviour of Mexican state legislators from 2012 to 2018. To analyze position-taking, I apply correspondence analysis to a new dataset of over half a million legislative speeches in twenty states. To study voting, I analyze over 14,500 roll-call votes in fourteen states during the same period. Results show that reelection incentives increased intra-party unity, which has broad implications for countries introducing electoral reforms aiming to personalize politics.
A common challenge in studying Italian parliamentary discourse is the lack of accessible, machine-readable, and systematized parliamentary data. To address this, this article introduces the ItaParlCorpus dataset, a new, annotated, machine-readable collection of Italian parliamentary plenary speeches for the Camera dei Deputati, the lower house of Parliament, spanning from 1948 to 2022. This dataset encompasses 470 million words and 2.4 million speeches delivered by 5830 unique speakers representing 77 different political parties. The files are designed for easy processing and analysis using widely-used programming languages, and they include metadata such as speaker identification and party affiliation. This opens up opportunities for in-depth analyses on a variety of topics related to parliamentary behavior, elite rhetoric, and the salience of political themes, exploring how these vary across party families and over time.
Previous accounts have suggested a potential divergence between Xi Jinping and Li Keqiang in their approaches to economic governance. This study examines the policy orientations of the two leaders concerning state–market relations, providing empirical evidence for the recent manifestation of what insiders have termed the “dispute between north and south houses” (nanbeiyuan zhi zheng) and its economic implications. By applying semi-supervised machine learning methods to textual data, this study demonstrates that Li favoured market-oriented policies, whereas Xi displayed a pronounced preference for state-centric strategies. The findings notably indicate an initial divergence in policy orientation, which was followed by a considerable convergence during Xi's second term. Our analysis further reveals that Li's market-oriented rhetoric was particularly prominent during “Mass innovation week,” indicating a campaign-style policy mobilization. Moreover, the analysis identifies that the discursive differences between the two leaders are associated with a decline in firm-level investment, suggesting that disparities in policy orientation may engender political uncertainty. This study contributes to the extant literature on the impact of leadership dynamics on economic policy, the implications of mixed signals from the central leadership and the phenomenon of campaign-style mobilization in China.
We apply moral foundations theory (MFT) to explore how the public conceptualizes the first eight months of the conflict between Ukraine and the Russian Federation (Russia). Our analysis includes over 1.1 million English tweets related to the conflict over the first 36 weeks. We used Linguistic Inquiry and Word Count (LIWC) and a moral foundations dictionary to identify tweets’ moral components (care, fairness, loyalty, authority, and sanctity) from the United States, pre- and post-Cold War NATO countries, Ukraine, and Russia. Following an initial spike at the beginning of the conflict, tweet volume declined and stabilized by week 10. The level of moral content varied significantly across the five regions and the five moral components. Tweets from the different regions included significantly different moral foundations to conceptualize the conflict. Across all regions, tweets were dominated by loyalty content, while fairness content was infrequent. Moral content over time was relatively stable, and variations were linked to reported conflict events.
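The dictionary-based coding behind this kind of analysis can be sketched simply: count the share of a tweet's tokens that match each moral foundation's word list, in the spirit of LIWC with a moral foundations dictionary. The mini-dictionary below is an illustrative stand-in, not the actual dictionary entries.

```python
# LIWC-style moral-content profile: percent of tokens matching each
# foundation's word list. Dictionary words are hypothetical examples.
from collections import Counter

MFD = {
    "care":      {"protect", "safe", "harm", "suffer"},
    "fairness":  {"fair", "justice", "equal", "rights"},
    "loyalty":   {"ally", "betray", "nation", "solidarity"},
    "authority": {"obey", "order", "law", "command"},
    "sanctity":  {"pure", "sacred", "disgust", "holy"},
}

def moral_profile(text):
    """Share of tokens matching each moral foundation."""
    tokens = text.lower().split()
    counts = Counter()
    for tok in tokens:
        for foundation, words in MFD.items():
            if tok in words:
                counts[foundation] += 1
    return {f: counts[f] / len(tokens) for f in MFD}

tweet = "we stand in solidarity with our nation and will not betray our allies"
profile = moral_profile(tweet)
print(max(profile, key=profile.get))  # prints "loyalty"
```

Real dictionaries also use wildcard stems (e.g. matching word prefixes), which this sketch omits for brevity.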
Stylistics is the linguistic study of style in language. Now in its second edition, this book is an introduction to stylistics that locates it firmly within the traditions of linguistics. Organised to reflect the historical development of stylistics, it covers key principles such as foregrounding theory, as well as recent advances in cognitive and corpus stylistics. This edition has been fully revised to cover all the major developments in the field since the first edition, including extensive coverage of corpus stylistics, new sections on a range of topics, additional exercises and commentaries, updated further reading lists, and an entirely re-written final chapter on the disciplinary status of stylistics and its relationship to linguistics, plus a manifesto for the future of the field. Comprehensive in its coverage and assuming no prior knowledge of the subject, it is essential reading for students and researchers new to this fascinating area of language study.
Large language models are a powerful tool for conducting text analysis in political science, but using them to annotate text has several drawbacks, including high cost, limited reproducibility, and poor explainability. Traditional supervised text classifiers are fast and reproducible, but require expensive hand annotation, which is especially difficult for rare classes. This article proposes using LLMs to generate synthetic training data for training smaller, traditional supervised text models. Synthetic data can augment limited hand annotated data or be used on its own to train a classifier with good performance and greatly reduced cost. I provide a conceptual overview of text generation, guidance on when researchers should prefer different techniques for generating synthetic text, a discussion of ethics, a simple technique for improving the quality of synthetic text, and an illustration of its limitations. I demonstrate the usefulness of synthetic training data through three validations: synthetic news articles describing police responses to communal violence in India for training an event detection system, a multilingual corpus of synthetic populist manifesto statements for training a sentence-level populism classifier, and synthetic tweets describing the fighting in Ukraine for improving a named entity recognition system.
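The core idea can be sketched under toy assumptions: an LLM (stubbed here with hand-written strings) generates synthetic labeled examples, which then train a small, cheap, reproducible classifier. The texts and the naive Bayes model below are illustrative, not the article's implementation.

```python
# Synthetic texts stand in for LLM output; a multinomial naive Bayes
# classifier with add-one smoothing is trained on them (illustrative
# sketch, not the article's models or data).
import math
from collections import Counter, defaultdict

synthetic = [
    ("police fired tear gas to disperse rioters in the old city", "event"),
    ("officers clashed with a mob after communal violence broke out", "event"),
    ("the minister praised the new irrigation scheme at a rally", "no_event"),
    ("markets were calm as the festival season began peacefully", "no_event"),
]

def train_nb(data):
    """Collect per-label word counts, label counts, and the vocabulary."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in data:
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab

def predict(model, text):
    """Pick the label with the highest smoothed log-likelihood."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best, best_lp = None, -math.inf
    for label in label_counts:
        lp = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in text.lower().split():
            lp += math.log((word_counts[label][tok] + 1) / denom)
        if lp > best_lp:
            best, best_lp = label, lp
    return best

model = train_nb(synthetic)
print(predict(model, "police clashed with rioters"))  # prints "event"
```

Once trained, the small model annotates new documents at negligible cost and deterministically, which is the reproducibility argument the article makes against direct LLM annotation.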
Content analysis is a valuable tool for analysing policy discourse, but annotation by humans is costly and time consuming. ChatGPT is a potentially valuable tool to partially automate content analysis for policy debates, largely replacing human annotators. We evaluate ChatGPT’s ability to classify documents using pre-defined argument descriptions, comparing its performance with human annotators for two policy debates: the Universal Basic Income debate on Dutch Twitter (2014–2016) and the pension reforms debate in German newspapers (1993–2001). We use the API (GPT-4 Turbo) and user interface version (GPT-4) and evaluate multiple performance metrics (accuracy, precision and recall). ChatGPT is highly reliable and accurate in classifying pre-defined arguments across datasets. However, precision and recall are much lower, and vary strongly between arguments. These results hold for both datasets, despite differences in language and media type. Moreover, the cut-off method proposed in this paper may aid researchers in navigating the trade-off between detection and noise. Overall, we do not (yet) recommend a blind application of ChatGPT to classify arguments in policy debates. Those interested in adopting this tool should manually validate bot classifications before using them in further analyses. At least for now, human annotators are here to stay.
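The evaluation this study reports, comparing model labels against human annotations, hinges on the gap between accuracy and precision/recall for individual arguments. The sketch below computes all three metrics for one binary argument label; the gold and predicted labels are made-up toy data, not the study's.

```python
# Precision, recall, and accuracy for one argument class (toy data).
# With a rare argument, accuracy can look high while precision and
# recall stay low, which is the study's central caution.
def precision_recall_accuracy(gold, pred, positive=1):
    tp = sum(g == positive and p == positive for g, p in zip(gold, pred))
    fp = sum(g != positive and p == positive for g, p in zip(gold, pred))
    fn = sum(g == positive and p != positive for g, p in zip(gold, pred))
    correct = sum(g == p for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, correct / len(gold)

# 1 = document contains the argument (human label vs model output).
gold = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
p, r, acc = precision_recall_accuracy(gold, pred)
print(p, r, acc)  # prints 0.5 0.5 0.8
```

Here accuracy is 0.8 even though the classifier finds only half the true argument instances and half its positive calls are wrong, illustrating why the authors recommend validating per-argument precision and recall rather than accuracy alone.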