In the current technological landscape, the field of artificial intelligence (AI), and natural language processing (NLP) in particular, is experiencing a paradigm shift from a technology-centric to a human-centric orientation (Shneiderman, 2022). This shift is spurred by the growing recognition that purely technical uses of AI can reinforce bias, be inscrutable, and lack human values. The evolution of generative AI has taken this challenge to new heights, transforming AI into a ‘partner’ that can collaborate deeply with humans (Krakowski, 2025; Wessel et al., 2025) and prompting new AI pedagogy and governance frameworks (Capel and Brereton, 2023; Schmager, Pappas, and Vassilakopoulou, 2025). Consequently, a central challenge has emerged: how to ensure these powerful language technologies are designed and applied in a manner that is fair, transparent, interpretable, and respectful of cultural diversity. In response, Peng Wang and Pete Smith’s Multilingual Artificial Intelligence (Wang and Smith, 2025) presents itself not merely as a technological guide but also as a conceptual exploration of how to practice the ‘human-centered AI’ philosophy in a multilingual, multicultural world. It provides a strategic road map rather than a hands-on programming manual, aiming to bridge the gap between the diverse stakeholders in the multilingual communication process. Ultimately, the authors frame multilingual AI as a dual-purpose tool: one that not only enhances human productivity but also acts as a reflective medium for understanding our own cognition.
The book’s conceptual map is elegantly structured as a three-part journey that guides readers from foundational theory to applications and, finally, to an analytical and humanistic reflection. This organization, which keeps the humanistic perspective on technology in view for a general reader, sets the book apart from overly technical NLP engineering manuals on the one hand and from purely theoretical critiques of AI that lack a technological context on the other. Part One, ‘Fundamentals of multilingual artificial intelligence’ (Chapters 1–4), establishes the theoretical groundwork for the book. It starts unconventionally but effectively in Chapter 1 by recasting multilingual AI not solely as a computational issue but as a communication process governed by Shannon’s (1948) mathematical theory of communication. By taking the Shannon-Weaver model and the concept of the ‘noisy channel’ as their theoretical focus, the authors establish a powerful, unifying metaphor for the human–machine interactions that follow. From this conceptual anchor, the book proceeds to outline the data landscape (Chapter 2) and the elementary paradigms of AI (Chapter 3), presenting linguistic data as the ‘great open basement’: a common ground for communication between deductive, human symbolic reasoning and inductive, data-driven machine learning. The first part culminates in Chapter 4, which traces the representation of meaning from the structuralist and semiotic theories of Saussure (De Saussure, 1989) and Peirce (Peirce, 1931–1958) to modern vector semantics. A word2vec case study effectively demonstrates how machines can extract contextual meaning, learning to ‘know a word by the company it keeps’ (p. 81).
Part Two, ‘Large Language Models (LLMs): theories and applications’ (Chapters 5–7), addresses the technical heart of NLP as it stands today. Chapter 5 grounds machine learning in information theory and in the probabilistic basis of the noisy channel model introduced earlier. From this theoretical anchor, it lays out the progression from the simple Perceptron to the pivotal Transformer architecture, detailing key considerations for creating multilingual models. It also demystifies the two primary methods for working with LLMs, fine-tuning and prompt engineering, rooting them in a practical case study on machine translation quality evaluation. Chapters 6 and 7 then shift from fundamental theory to advanced applications. The authors contrast traditional information retrieval (IR) with modern, embedding-based vector search, and then explore how to augment LLM performance by integrating external human knowledge. This part culminates in a timely and practical case study on retrieval-augmented generation (RAG), which demonstrates how a knowledge graph can give an LLM secure access to private data it was never trained on, thereby improving its accuracy and explainability.
Part Three, ‘Culture and multicultural AI’ (Chapters 8–10), represents the book’s most distinctive and forward-looking contribution to pedagogy and policy, moving beyond linguistic diversity to the far more complex challenge of cultural representation. Chapter 8 offers a critical examination of the training data behind today’s models, highlighting the biases embedded in corpora like Wikipedia and the resource inequality that creates a digital divide between high-resource ‘Winners’ and low-resource ‘Left-Behinds’. This is followed by a discussion of complex issues like the ‘curse of multilinguality’. Chapter 9 then pivots to the core challenge: are today’s LLMs truly multicultural? (p. 54) Using frameworks like the Hofstede model and the World Values Survey, the authors present evidence that current models are heavily biased toward ‘WEIRD’ (Western, Educated, Industrialized, Rich, and Democratic) value systems. The book concludes in Chapter 10 by examining the future of AI ethics in pedagogy, the need for multicultural AI competence frameworks, and the indispensable role of linguists and cultural specialists in informing the ethical development of these powerful technologies.
The defining features of this book are evident throughout: an interdisciplinary approach connecting linguistic theory to AI, a deep focus on multilingual and multicultural LLMs, and a comprehensive use of real-world case studies. Its significance lies in three areas. First, it excels in making esoteric computational concepts accessible by using theoretical models from linguistics and communication theory, such as Shannon’s ‘noisy channel’ metaphor (Chapter 1), as conceptual stepping stones to intricate technical content. Second, it takes a practical and instructive stance by consistently linking theory to cutting-edge practice. Each core chapter grounds its coverage in concrete case studies on topics like machine translation quality evaluation (Chapter 5), cross-lingual information retrieval (Chapter 6), and retrieval-augmented generation (RAG) (Chapter 7) to foster a fine-grained comprehension of real-world applications. Third, it adopts a future-oriented and human-centric perspective, exploring crucial current issues such as cultural bias in LLMs, emerging AI policy, and the future role of language experts in the age of AI, as discussed in its final three chapters. It thus provides a comprehensive guide to both the technical and societal dimensions of multilingual AI.
Nonetheless, the book’s strength as a conceptual road map for a non-specialist reader also marks its boundaries, since it deliberately favors breadth over depth. This manifests in two significant ways. First, in its technical scope, the book prioritizes a strategic, high-level perspective at the expense of pragmatic detail. Readers keen to implement the discussed techniques will find no code snippets or programming instructions. The book thus explains the ‘what’ and ‘why’ but not the ‘how’ of multilingual AI. For that, practitioners would need to turn to hands-on guides like Natural Language Processing with Transformers (Tunstall, VonWerra and Wolf, 2022) or introductory textbooks like Python for Linguists (Hammond, 2020) and Natural Language Processing with Python (Bird, Klein, and Loper, 2009), which provide the coding skills necessary to turn theory into practice. This focus on broad principles instead of specific techniques, though, is also a wise concession to a further challenge: timeliness. In a field where specific models are quickly superseded, the book’s value lies in being a primer on foundational concepts rather than a constantly updated technical manual. Second, this high-level approach extends to its commendable section on cultural and ethical dimensions. While the book raises crucial questions, it does not delve into the granular, algorithmic solutions that are beginning to emerge. For readers wishing to unpack the mechanics of AI ethics, Donghee Shin’s Debiasing AI (Shin, 2025) offers a theoretical and empirical foundation, examining how human biases are embedded, amplified, and propagated through AI systems. For readers seeking a deeper exploration of how social values can be algorithmically embedded, a more specialized text like The Ethical Algorithm (Kearns and Roth, 2019) is essential.
It offers clear algorithmic strategies, such as differential privacy, contextual bandits, and model fairness constraints, that directly address how to encode social values like fairness, privacy, and accountability into the logic of algorithms themselves, translating ethical principles into practice. Similarly, for a perspective on the ethics of digital research methodology, Salganik’s Bit by Bit (Salganik, 2019) situates AI ethics within the broader landscape of digital social research, covering a wide spectrum of methodologies from large-scale surveys and digital experiments to mass collaboration. This framework is not limited to algorithms but encompasses the entire research life cycle, including ethical challenges in data collection (like the sensitive issue of web scraping) and the communication of results. These comparisons are not intended to find fault with Wang and Smith but to suggest that the book is an excellent starting reference rather than the definitive commentary on the highly specialized and complicated domain of AI ethics. These limitations, however, do not detract from the book’s core achievement. They are the result of a deliberate choice to create a conceptual and accessible bridge for a non-specialist audience. The book defines its niche clearly, prioritizing the human-centered ‘why’ over the engineering-focused ‘how’.
In conclusion, Wang and Smith’s choice to prioritize the human-centered ‘why’ over the engineering-focused ‘how’ is both the book’s greatest strength and its defining limitation. It is this very choice that makes Multilingual Artificial Intelligence a landmark contribution that successfully bridges the gap between the humanities and computational science. It is more than a textbook; it is a timely manifesto for a more human-centered approach to NLP, offering a worthy contribution to the fields of computational linguistics, translation studies, and digital humanities. Without question, this volume will serve as a standard reference for graduate students, researchers, and practitioners interested not only in the ‘how’ of AI but, more importantly, in the ‘why’ and the ‘what for’.
Acknowledgments
We are indebted to the book review editor, Eugenio Martínez Cámara, for his insightful comments and suggestions on a previous draft.
Competing interests
The authors hereby affirm that there are no competing interests pertaining to this manuscript.