Survey in characterizing semantic change

Published online by Cambridge University Press:  23 April 2026

Jader Martins Camboim de Sá*
Affiliation:
Luxembourg Institute of Science and Technology, Esch-sur-Alzette, L4362, Luxembourg; University of Luxembourg, Esch-sur-Alzette, L4365, Luxembourg
Marcos Da Silveira
Affiliation:
Luxembourg Institute of Science and Technology, Esch-sur-Alzette, L4362, Luxembourg
Cédric Pruski
Affiliation:
Luxembourg Institute of Science and Technology, Esch-sur-Alzette, L4362, Luxembourg
*Corresponding author: Jader Martins Camboim de Sá; Email: jader.martins@list.lu

Abstract

Living languages continuously evolve to reflect the cultural changes of human societies. This evolution manifests through neologisms (new words) or the semantic change of existing words (new meanings for existing words). Understanding the meaning of words is vital for interpreting texts from different cultures (regionalisms or slang), domains (e.g., technical terms), or time periods. In computer science, this phenomenon is relevant to computational linguistics tasks such as machine translation, information retrieval, and question answering. Semantic change can impact the performance of these applications, making it important to understand and characterize these changes formally. This problem has recently attracted significant attention from the computational linguistics community. Several approaches can detect semantic changes with good precision, but more effort is needed to characterize how word meanings change and to determine how to mitigate the impact of this phenomenon. This survey provides a comprehensive overview of existing approaches to the characterization of semantic change. We also formally define three classes of characterization: change in dimension (whether a word’s meaning becomes broader or narrower), change in orientation (whether a word acquires a more pejorative or ameliorative sense), and change in relation (whether a word is used in a new figurative context, such as a metaphor or metonymy). We demonstrate the applicability of this formalism on existing corpora, summarize the key aspects of selected publications, and discuss current needs and trends in research on semantic change characterization.
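
The three classes named in the abstract can be illustrated with a minimal sketch. It assumes that a word's meaning in each period is summarized as a distribution over senses, and that each sense carries a polarity score and a figurative flag. The function names, the entropy-based measure of breadth, and the toy counts are all illustrative assumptions, not the formal definitions given in the survey.

```python
import math

def normalize(counts):
    """Turn raw sense counts into a probability distribution."""
    total = sum(counts.values())
    return {sense: c / total for sense, c in counts.items()}

def entropy(dist):
    """Shannon entropy in bits; used here as a rough proxy for breadth of meaning."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def expected(dist, attribute):
    """Expected value of a per-sense attribute under the sense distribution."""
    return sum(p * attribute[sense] for sense, p in dist.items())

def characterize(counts_t1, counts_t2, polarity, figurative):
    """Compare two periods along the three classes (hypothetical operationalization)."""
    d1, d2 = normalize(counts_t1), normalize(counts_t2)
    return {
        # dimension: > 0 suggests broadening, < 0 suggests narrowing
        "dimension": entropy(d2) - entropy(d1),
        # orientation: > 0 suggests amelioration, < 0 suggests pejoration
        "orientation": expected(d2, polarity) - expected(d1, polarity),
        # relation: > 0 suggests a larger share of figurative uses
        "relation": expected(d2, figurative) - expected(d1, figurative),
    }

# Toy, made-up sense counts for two periods (not taken from any corpus).
counts_t1 = {"sense_a": 8, "sense_b": 2}
counts_t2 = {"sense_a": 4, "sense_b": 4, "sense_c": 2}
polarity = {"sense_a": 0.0, "sense_b": 0.5, "sense_c": 0.0}   # assumed scores
figurative = {"sense_a": 0, "sense_b": 1, "sense_c": 1}       # 1 = figurative

result = characterize(counts_t1, counts_t2, polarity, figurative)
print(result)
```

With these toy inputs all three values are positive, which under the assumptions above would read as broadening, amelioration, and a shift toward figurative use.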

Information

Type
Survey Paper
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2026. Published by Cambridge University Press

Figure 1. Taxonomy for the poles of Lexical Semantic Change, based on the work of Blank et al. (2003); Koch (2016); Hock and Joseph (2019).

Figure 2. Change in meaning and orientation for the word ‘awful’. On the left, we reproduce the figure from Hamilton et al. (2016b) showing the evolution of ‘awful’ in the embedding space. On the right, we present the hypothetical function ($\mathfrak{f}$) for this word over time.

Table 1. Search terms used to discover articles related to change characterization

Figure 3. Graph of selected works (green) and related articles (blue).

Table 2. Overview of Various Corpora for Diachronic Studies

Figure 4. Adapted from Inoue et al. (2022). The stacked bar plots represent the topics obtained over time for the words ‘coach’, ‘record’, and ‘power’, respectively. New senses emerge and become dominant over time.

Figure 5. Adapted from Ehmüller et al. (2020). Ego-network for ‘mouse’, built from the word co-occurrence graph. In 1830 the word was used in the senses of ‘weak’ and ‘rat’, whereas in 1960 the sense of ‘computer device’ emerged.

Figure 6. Adapted from Hamilton et al. (2016a). The SentiProp algorithm propagates the polarity from seed words based on the distance of the connected nodes. Words are connected based on co-occurrence statistics.

Figure 7. Plot obtained from Moss (2020). The author represents words as Gaussian embeddings and analyzes their variance and proximity over time. The word ‘gay’ increased in variance and moved closer to ‘homosexual’.

Figure 8. Adapted from Fonteyn and Manjavacas (2021). The line plot shows the evolution of the polarity of the multi-word expression ‘to death’. It shifted from a negative concept to a more positive one, ameliorating the dominant sense.

Figure 9. Illustration adapted from Xie et al. (2019). The figure shows how moral sentiment toward slavery, democracy, and gay rights evolved over two centuries, mapped in a 2D embedding space.

Table 3. Comparison of studies by type of method used and category of change. We mark which kind of characterization each work conducts: $\mathscr{D}$ for dimension, $\mathscr{R}$ for relation, and $\mathscr{O}$ for orientation. We also highlight the main representation method ($\mathfrak{R}$) for the word meaning.

Table 4. Pros and Cons of Representation Methods for Semantic Change Characterization

Figure 10. Definition of semantic change adapted from Koch (2016, pp. 23, 25).

Figure 11. Illustrative example of the word ‘heart’ changing over time. Metaphorization, amelioration and broadening can occur for the same word, depending on the senses it gained/lost.

Table 5. Distribution of WordNet senses for the word ‘heart’ in SEMCOR and MASC. The score column indicates a positive score for that sense from SentiWordNet, and the figurative column indicates whether the meaning is figurative or literal.

Table 6. Distribution of WordNet senses for the word ‘plane’ in SEMCOR and MASC. The score column indicates a positive score for that sense from SentiWordNet, and the figurative column indicates whether the meaning is figurative or literal.

Figure 12. Change in meaning distribution for the word ‘heart’ in SEMCOR and MASC.

Figure 13. Change in meaning distribution for the word ‘plane’ in SEMCOR and MASC.