Semantic enrichment of neural word embeddings: Leveraging taxonomic similarity for enhanced distributional semantics

Published online by Cambridge University Press: 30 July 2025

Dongqiang Yang*
Affiliation:
School of Computer Science and Technology, Shandong Jianzhu University, Jinan, China
Xinru Zhang
Affiliation:
School of Computer Science and Technology, Shandong Jianzhu University, Jinan, China
Tonghui Han
Affiliation:
School of Computer Science and Technology, Shandong Jianzhu University, Jinan, China; School of Computing, Engineering and the Built Environment, Ulster University, Belfast, UK
Yi Liu
Affiliation:
School of Computer Science and Technology, Shandong Jianzhu University, Jinan, China
*Corresponding author: Dongqiang Yang; Email: dongqiang.yang@gmail.com

Abstract

Data-driven neural word embeddings (NWEs), grounded in distributional semantics, capture a wide range of linguistic regularities and can be further enriched by incorporating structured knowledge resources. This work proposes a novel post-processing approach for injecting semantic relationships into the vector space of both static and contextualized NWEs. Current retrofitting (RF) methods for word embeddings often oversimplify the integration of semantic knowledge and neglect the nuanced differences between relationships, which may result in suboptimal performance. Instead of applying multi-thresholding to distance boundaries in metric learning, we compute taxonomic similarity to dynamically adjust these boundaries during the semantic specialization of word embeddings. Benchmark evaluations on both static and contextualized word embeddings show that our dynamic-fitting (DF) approach achieves state-of-the-art (SOTA) correlations of 0.78 and 0.76 on SimLex-999 and SimVerb-3500, respectively, highlighting the effectiveness of incorporating multiple semantic relationships in refining vector semantics. Our approach also outperforms existing RF methods in both supervised and unsupervised semantic relationship recognition tasks. It achieves top accuracy scores for hypernymy detection on the BLESS, WBLESS, and BIBLESS datasets (0.97, 0.89, and 0.83, respectively) and an F1 score of over 0.60 on classifying four types of semantic relationships in Subtask-2 of the CogALex-V shared task, surpassing all participant systems. In the analogy reasoning task of the Bigger Analogy Test Set, our approach outperforms existing RF methods on inferring relational similarity. These consistent improvements across various lexical semantics tasks suggest that our DF approach can effectively integrate distributional semantics with symbolic knowledge resources, thereby enhancing the representation capacity of word embeddings in downstream applications.
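
As a rough illustration of how correlation scores of the kind reported above on SimLex-999 and SimVerb-3500 are usually obtained, the sketch below ranks word pairs by cosine similarity and compares that ranking with human ratings. It assumes Spearman's rank correlation, which is the convention for these benchmarks but is not stated in the abstract. The function names, the toy vectors, and the ratings are all hypothetical and are not taken from the article or from either dataset.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """1-based ranks in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def evaluate(embeddings, rated_pairs):
    """Correlate model cosine similarities with human similarity ratings."""
    predicted = [cosine(embeddings[a], embeddings[b]) for a, b, _ in rated_pairs]
    gold = [rating for _, _, rating in rated_pairs]
    return spearman(predicted, gold)


# Toy data, invented for illustration only.
toy_embeddings = {
    "car": [1.0, 0.0],
    "automobile": [1.0, 0.1],
    "banana": [0.0, 1.0],
}
toy_pairs = [
    ("car", "automobile", 9.0),
    ("car", "banana", 1.0),
    ("automobile", "banana", 2.0),
]
rho = evaluate(toy_embeddings, toy_pairs)
```

Because only the ranking matters, a retrofitting method improves this score when it reorders pairs to match human judgments more closely, regardless of the absolute similarity values.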

Information

Type
Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press
Figure 1. Illustration of our proposed method for semantically retrofitting word embeddings. In a vector space model, $w$, $w^{+}$, and $w^{-}$ stand for a target word, its positive (related) sample, and its negative (unrelated) sample, respectively. In distance metric learning, $Margin(w, w^{+})$ refers to the boundary, or margin, between the distributional distances $D(w, w^{+})$ and $D(w, w^{-})$ in the triplet loss. We propose adjusting this margin adaptively using taxonomic similarity, where $SD$ denotes the shortest distance between $w$ and $w^{+}$ within a semantic network.
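
The idea in the caption can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the caption does not give the formula that maps $SD$ to the margin, so the path-style mapping `max_margin / (1 + SD)` is an assumption, as are cosine distance for $D$, the function names, and the toy semantic network.

```python
import math
from collections import deque


def shortest_distance(graph, source, target):
    """Breadth-first search; returns the number of edges on the shortest
    path between two nodes of the network, or None if none exists."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def cosine_distance(u, v):
    """Distributional distance D, taken here (by assumption) as 1 - cosine."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def dynamic_margin(sd, max_margin=1.0):
    """Assumed mapping from SD to the margin: taxonomically closer pairs
    (small SD) receive a larger margin. Not the article's formula."""
    return max_margin / (1.0 + sd)


def triplet_loss(w, w_pos, w_neg, margin):
    """Hinge-style triplet loss: max(0, margin + D(w, w+) - D(w, w-))."""
    return max(0.0, margin + cosine_distance(w, w_pos) - cosine_distance(w, w_neg))


# Hypothetical undirected semantic network, stored as adjacency lists.
toy_network = {
    "dog": ["canine"],
    "canine": ["dog", "carnivore"],
    "carnivore": ["canine", "feline"],
    "feline": ["carnivore", "cat"],
    "cat": ["feline"],
}

sd = shortest_distance(toy_network, "dog", "canine")  # one edge apart
margin = dynamic_margin(sd)
loss = triplet_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0], margin)
```

With a fixed margin, every positive pair is pushed apart from its negatives by the same amount; making the margin a function of $SD$ lets closely related pairs be separated from negatives more strongly than loosely related ones.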

Figure 2. A flowchart of dynamic fitting on neural word embeddings.

Table 1. Evaluation tasks for lexical entailment recognition and lexical relationship classification

Table 2. Measuring semantic similarity using different retrofitting methods on NWEs

Table 3. Results of different retrofitting methods in lexical entailment recognition

Figure 3. Ablation study on the effects of different semantic constraints on retrofitting vector semantics. "Full" denotes using all the semantic relationships in DF: the IS-A link (hypernymy hierarchy), synonymy, and antonymy.

Figure 4. The effect of the hypernymy directionality function in DF on identifying lexical entailment.

Table 4. Results of different retrofitting methods on CogALex-V. Each cell shows the weighted F1 scores for Subtask-1/Subtask-2. The best scores for NWEs are highlighted in bold, with SOTA results from the respective papers included for comparison.

Table 5. Performance of DF-retrofitted GPT on Subtask-2 of CogALex-V. The left shows precision (P), recall (R), and F1 scores for each relationship after removing random pairs as noise, while the right presents the confusion matrix of classification, where each row corresponds to the actual counts of each relationship in the dataset, and each column corresponds to the predicted counts

Table 6. Results of different retrofitting methods on lexical analogy reasoning conditioned on hypernymy. Only the top 6 candidates in each of the varied distributional spaces of GloVe are listed. Identical candidates are shown in the same color.

Table 7. Results of different retrofitting methods on lexical analogy reasoning conditioned on synonymy. Only the top 6 candidates in each of the varied distributional spaces of GloVe are listed. Identical candidates are shown in the same color.

Table 8. Results of different retrofitting methods on lexical analogy reasoning conditioned on antonymy. Only the top 6 candidates in each of the varied distributional spaces of GloVe are listed. Identical candidates are shown in the same color.

Figure 5. Results of retrofitted GloVe for three types of semantic relations in lexical analogy reasoning.

Table 9. Analogy reasoning methods on BATS. The best results are in black font. The results of $\mathrm{BERT}_{100}^{\max}$ are F1 values.