The role of meaning in the rivalry of -ity and -ness: evidence from distributional semantics

Published online by Cambridge University Press: 23 January 2025

MARTIN SCHÄFER*
Affiliation:
Anglistik III Heinrich Heine Universität Düsseldorf Universitätsstraße 1 40225 Düsseldorf Germany post@martinschaefer.info

Abstract

Both -ity and -ness are frequent and productive suffixes in English that fulfill the same core function: turning adjectives into nouns that denote the state or quality of whatever the adjective denotes. This well-known affix rivalry raises two core questions: (1) What determines the choice between -ity and -ness for a given base word? (2) Are the two affixes synonymous? For the first question, previous work has focused on the morphological and phonological properties of the bases, but not on their semantics. For the second question, the literature fails to give a convincing answer: some studies, faced with doublets like ethnicity/ethnicness, argue for a semantic difference, but most assume synonymy. Using pretrained distributional vectors, I show empirically, first, that the semantics of the bases plays a major role in affix selection and, second, that the two affixes induce similar meaning shifts.

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press

Table 1. Toy example illustrating the first step in creating a simple distributional model: collecting cooccurrence counts for the target words, here the cooccurrences of inclusive, inclusiveness and inclusivity with the three nouns environment, part and price
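
The counting step that table 1 describes can be sketched as follows. This is a minimal illustration and not the article's pipeline: the mini-corpus, the window size of two tokens and the function name `cooccurrence_counts` are all invented for the example, and the resulting counts are not the values reported in table 1.

```python
from collections import Counter

def cooccurrence_counts(sentences, targets, contexts, window=2):
    """Count how often each context word occurs within `window` tokens of each target."""
    counts = {t: Counter({c: 0 for c in contexts}) for t in targets}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok not in counts:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i and tokens[j] in contexts:
                    counts[tok][tokens[j]] += 1
    return counts

# Hypothetical mini-corpus, made up purely for illustration.
corpus = [
    "an inclusive environment matters",
    "the inclusive price covers breakfast",
    "inclusivity is part of the environment",
    "inclusiveness is part of the culture",
]
targets = ["inclusive", "inclusiveness", "inclusivity"]
contexts = ["environment", "part", "price"]

table = cooccurrence_counts(corpus, targets, contexts)
for word in targets:
    # One row per target word, one column per context noun.
    print(word, [table[word][c] for c in contexts])
```

Each printed row is a count vector with one dimension per context noun, which is the kind of object a simple distributional model starts from.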

Figure 1. A three-dimensional space showing the distributional vectors of inclusiveness, inclusivity and inclusive based on the toy data in table 1. The three dimensions stand for the cooccurrences with the nouns environment, part and price
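
Vectors in a space like the one in figure 1 are usually compared by the cosine of the angle between them, the measure used in the later tables and figures on cosine similarity. The sketch below shows that computation on three-dimensional count vectors. The counts are hypothetical stand-ins (the values of table 1 are not reproduced here), and `cosine_similarity` is a helper name chosen for this example.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)

# Hypothetical cooccurrence counts with (environment, part, price).
vectors = {
    "inclusive": (4, 1, 3),
    "inclusiveness": (3, 2, 0),
    "inclusivity": (5, 2, 0),
}

sim_derivatives = cosine_similarity(vectors["inclusiveness"], vectors["inclusivity"])
sim_base_ity = cosine_similarity(vectors["inclusive"], vectors["inclusivity"])
print(round(sim_derivatives, 3), round(sim_base_ity, 3))
```

Because cosine similarity depends only on direction, two words with very different overall frequencies can still come out as near neighbours if they occur in similar proportions with the same context words.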

Table 2. Distribution of the -ity and -ness lemmata in terms of token frequency. Doublets are excluded

Table 3. Distribution of endings in the adjective bases of -ity/-ness derivatives, with one example for each affix. Doublets are excluded

Table 4. Toy confusion matrix for an LDA classifier predicting the class, -ity or -ness, for 250 vectors
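
A confusion matrix of the kind shown in table 4 simply cross-tabulates the true class of each item against the class the classifier predicted. The sketch below builds one from two label lists. The labels are invented for illustration (they are not the 250 vectors of table 4), the predictions could come from any classifier, LDA or otherwise, and `confusion_matrix` and `accuracy` are names chosen for this example.

```python
def confusion_matrix(gold, predicted, labels):
    """Rows are true classes, columns are predicted classes."""
    matrix = {g: {p: 0 for p in labels} for g in labels}
    for g, p in zip(gold, predicted):
        matrix[g][p] += 1
    return matrix

def accuracy(matrix):
    """Share of items on the diagonal, i.e. correctly classified."""
    correct = sum(matrix[label][label] for label in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total

# Hypothetical true and predicted affix classes for eight items.
gold = ["ity", "ity", "ity", "ness", "ness", "ness", "ness", "ness"]
predicted = ["ity", "ity", "ness", "ness", "ness", "ness", "ity", "ness"]

matrix = confusion_matrix(gold, predicted, labels=["ity", "ness"])
for true_class, row in matrix.items():
    print(true_class, row)
print("accuracy:", accuracy(matrix))
```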

Figure 2. Projection of the vectors of the adjective bases into two-dimensional space using the t-SNE dimension reduction technique. Bases of doublets are excluded

Figure 3. Projection of the vectors of the adjective bases ending in -ive into two-dimensional space using the t-SNE dimension reduction technique. Bases of doublets are excluded

Figure 4. Projection of the vectors of the non-doublet derivatives into two-dimensional space using the t-SNE dimension-reduction technique. Derivatives forming doublets are excluded

Figure 5. Projection of the vectors of derivatives of adjectives ending in -ive into two-dimensional space using the t-SNE dimension-reduction technique. Derivatives forming doublets are excluded

Figure 6. Density plots for cosine similarities between paired -ity and -ness bases. Derivatives forming doublets are excluded

Table 5. Overview of LDAs across frequency bands for both bases and derivatives, reporting the mean weighted F1-score and the standard deviation. The weighted F1-score of the baseline classifier is given in the rightmost column
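
The weighted F1-score reported in table 5 averages the per-class F1-scores, weighting each class by its number of true instances. The sketch below computes it from a confusion matrix stored as nested dictionaries (rows are true classes, columns are predicted classes). All counts are hypothetical, and the baseline shown here, which always predicts the majority class, is an assumption made for illustration; the baseline classifier used in the article may be defined differently.

```python
def f1_for_class(matrix, label):
    """F1-score for one class from a {true: {predicted: count}} matrix."""
    tp = matrix[label][label]
    fp = sum(matrix[other][label] for other in matrix if other != label)
    fn = sum(matrix[label][other] for other in matrix[label] if other != label)
    if tp == 0:
        return 0.0  # no correct predictions for this class
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def weighted_f1(matrix):
    """Per-class F1-scores averaged with weights equal to class support."""
    support = {label: sum(row.values()) for label, row in matrix.items()}
    total = sum(support.values())
    return sum(f1_for_class(matrix, l) * support[l] for l in matrix) / total

# Hypothetical classifier output for 100 -ity and 150 -ness items.
classifier = {"ity": {"ity": 80, "ness": 20}, "ness": {"ity": 30, "ness": 120}}
# Assumed baseline: every item is assigned the majority class, -ness.
baseline = {"ity": {"ity": 0, "ness": 100}, "ness": {"ity": 0, "ness": 150}}

print("classifier:", round(weighted_f1(classifier), 3))
print("baseline:", round(weighted_f1(baseline), 3))
```

Comparing the two numbers shows why a baseline column is useful: with unbalanced classes, a classifier that ignores its input can still reach a weighted F1-score well above zero.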

Table 6. Cosine similarities between the base and the derivative for all -ity and -ness pairs (excluding doublets)

Table 7. Beta regression for cosine similarity between the non-doublets. R-sq.(adj) = 0.09, deviance explained = 9.81%

Figure 7. Density plot of the log frequencies of the -ity and -ness derivatives in the doublets

Table 8. Distribution of endings in the bases of all doublets

Figure 8. Projection of the vectors of all doublets into two-dimensional space using the t-SNE dimension reduction technique

Table 9. Illustration of doublets across the distribution of cosine similarities within doublets. The two doublets closest to the respective cosine similarity values have been selected

Table 10. Distribution of cosine similarities within -ive doublets

Figure 9. Interaction plot of log frequencies of -ity and -ness derivatives in the beta regression model for doublet similarity. The individual panels show the relationship between the -ity log frequency and the cosine similarity within doublets for specific values of the -ness log frequency. These -ness log frequency values increase across the panels, from left to right, as shown at the top of the panels

Table 11. Beta regression for cosine similarity between the doublets. R-sq.(adj) = 0.137, deviance explained = 16.4%

Figure 10. Projection of adjective bases from three different semantic classes of adjectives: color (6 bases), human propensity (9 bases), and speed (9 bases) into two-dimensional space using the t-SNE dimension reduction technique

Figure 11. Projection of the vectors of all bases into two-dimensional space using the t-SNE dimension reduction technique. All -ble bases are labeled

Figure 12. Projection of the vectors of all bases into two-dimensional space using the t-SNE dimension reduction technique. All -ful and -ish bases are labeled

Figure 13. Projection of the vectors of all bases into two-dimensional space using the t-SNE dimension reduction technique. All -ive bases are labeled

Figure 14. ADJ bases, ultralow frequency, no doublets. Corresponding mean weighted LDA scores: 0.686, standard deviation 0.061

Figure 15. ADJ derivatives, ultralow frequency, no doublets. Corresponding mean weighted LDA scores: 0.693, standard deviation 0.066

Figure 16. ADJ bases, low frequency, no doublets. Corresponding mean weighted LDA scores: 0.816, standard deviation 0.04

Figure 17. ADJ derivatives, low frequency, no doublets. Corresponding mean weighted LDA scores: 0.821, standard deviation 0.032

Figure 18. ADJ bases, mid frequency, no doublets. Corresponding mean weighted LDA scores: 0.753, standard deviation 0.057

Figure 19. ADJ derivatives, mid frequency, no doublets. Corresponding mean weighted LDA scores: 0.747, standard deviation 0.057

Figure 20. ADJ bases, mid high frequency, no doublets. Corresponding mean weighted LDA scores: 0.665, standard deviation 0.079

Figure 21. ADJ derivatives, mid high frequency, no doublets. Corresponding mean weighted LDA scores: 0.683, standard deviation 0.08

Figure 22. ADJ bases, high frequency, no doublets. Corresponding mean weighted LDA scores: 0.678, standard deviation 0.072

Figure 23. ADJ derivatives, high frequency, no doublets. Corresponding mean weighted LDA scores: 0.645, standard deviation 0.094

Figure 24. ADJ bases, super high frequency, no doublets. Corresponding mean weighted LDA scores: 0.756, standard deviation 0.075

Figure 25. ADJ derivatives, super high frequency, no doublets. Corresponding mean weighted LDA scores: 0.777, standard deviation 0.102

Figure 26. ADJ bases, ultrahigh frequency, no doublets. Corresponding mean weighted LDA scores: 0.785, standard deviation 0.09

Figure 27. ADJ derivatives, ultrahigh frequency, no doublets. Corresponding mean weighted LDA scores: 0.805, standard deviation 0.087