
Adversarial natural language processing: overview, challenges, and policy implications

Published online by Cambridge University Press:  22 September 2025

Laxmi Shaw
Affiliation:
Department of Information Systems and Analytics, Texas State University, San Marcos, TX, USA
Mohammed Wasim Ansari
Affiliation:
Department of Information Systems and Analytics, Texas State University, San Marcos, TX, USA
Tahir Ekin*
Affiliation:
Department of Information Systems and Analytics, Texas State University, San Marcos, TX, USA
Corresponding author: Tahir Ekin; Email: t_e18@txstate.edu

Abstract

The emergence of large language models has significantly expanded the use of natural language processing (NLP) while also heightening exposure to adversarial threats. We present an overview of adversarial NLP with an emphasis on challenges, policy implications, emerging areas, and future directions. First, we survey attack methods and evaluate the vulnerabilities of popular NLP models. Next, we review defense strategies, including adversarial training. We then describe major policy implications, identify key trends, and suggest future directions, such as the use of Bayesian methods to improve the security and robustness of NLP systems.
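To make the notion of an adversarial text attack concrete, the following is a minimal sketch (not drawn from the article itself): a character-level perturbation that flips the prediction of a toy keyword-based sentiment scorer while leaving the text readable to a human. The scorer, word lists, and example review are all illustrative assumptions, not the models or data studied in the paper.

```python
# Minimal sketch of a character-level adversarial attack on a toy
# sentiment classifier. Everything here is illustrative, not the
# article's actual experimental setup.

def toy_sentiment(text: str) -> str:
    """Label text 'positive' if it contains more positive than negative cues."""
    positive = {"great", "excellent", "wonderful"}
    negative = {"terrible", "boring", "awful"}
    words = text.lower().split()
    score = sum(w in positive for w in words) - sum(w in negative for w in words)
    return "positive" if score > 0 else "negative"

def perturb(text: str, target_words: set) -> str:
    """Transpose two inner characters of each targeted cue word --
    a classic character-level perturbation that stays human-readable."""
    out = []
    for w in text.split():
        if w.lower() in target_words and len(w) > 3:
            w = w[0] + w[2] + w[1] + w[3:]  # swap characters at positions 1 and 2
        out.append(w)
    return " ".join(out)

review = "a great and wonderful film"
attacked = perturb(review, {"great", "wonderful"})

print(toy_sentiment(review))    # -> positive
print(attacked)                 # -> a gerat and wnoderful film
print(toy_sentiment(attacked))  # -> negative: the cue words no longer match
```

A human still reads "gerat" as "great", but the exact-match scorer no longer does; real attacks exploit the same gap between human and model perception, and defenses such as adversarial training expose the model to these perturbed inputs during training.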

Information

Type
Research Article
Creative Commons
Creative Commons License - CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Open Practices
Open data
Copyright
© The Author(s), 2025. Published by Cambridge University Press

Figure 1. Taxonomy of adversarial attacks and defenses in NLP.


Table 1. Review highlights of adversarial attacks in NLP


Table 2. Review highlights of adversarial defenses in NLP


Figure 2. Adversarial sentiment analysis example on IMDB dataset (Shaw et al., 2024).


Table 3. Select empirical results before and after attacks against ensembles of CNN and BiLSTM
