As generative AI technologies continue to advance at a rapid pace, they are fundamentally transforming the dynamics of human–AI interaction and collaboration, a phenomenon that was once relegated to the realm of science fiction. These developments not only present unprecedented opportunities but also introduce a range of complex challenges. Key factors such as trust, transparency, and cultural sensitivity have emerged as essential considerations in the successful adoption and efficacy of these systems. Furthermore, the intricate balance between human and AI contributions, the optimization of algorithms to accommodate diverse user needs, and the ethical implications of AI’s role in society pose significant challenges that require careful navigation. This chapter will delve into these multifaceted issues, analyzing both user-level concerns and the underlying technical and psychological dynamics that are critical to fostering effective human–AI interaction and collaboration.
The last decade has seen an exponential increase in the development and adoption of language technologies, from personal assistants such as Siri and Alexa, through automatic translation, to chatbots like ChatGPT. Yet questions remain about what we stand to lose or gain when we rely on them in our everyday lives. As a non-native English speaker living in an English-speaking country, Vered Shwartz has experienced both amusing and frustrating moments using language technologies: from relying on inaccurate automatic translation, to failing to activate personal assistants with her foreign accent. English is the world's foremost go-to language for communication, and mastering it past the point of literal translation requires acquiring not only vocabulary and grammar rules, but also figurative language, cultural references, and nonverbal communication. Will language technologies aid us in the quest to master foreign languages and better understand one another, or will they make language learning obsolete?
AI is evolving rapidly and is poised to have far-reaching societal and global impacts, including in the military domain. AI offers cognitive reasoning and learning about problem domains – processing large quantities of data to develop situational awareness, generate solution goals, recommend courses of action, and provide robotic systems with the means for sense-making, guidance, actions, and autonomy. This chapter explores metacognition – an emerging and revolutionary technology that enables AI to become self-aware, to think and reason about its own cognition – and its applications in the military domain, focusing on four areas: (1) improving human interaction with AI systems, (2) providing safe and ethical AI behavior, (3) enabling autonomous systems, and (4) improving automated decision aids. The chapter begins with an overview of foundational AI and metacognition concepts, followed by a discussion of the potential contribution of metacognition to improving military operations. The chapter concludes with speculations concerning the more distant future of metacognition and its implications for AI systems and warfare.
We study the performance of a commercially available large language model (LLM) known as ChatGPT on math word problems (MWPs) from the dataset DRAW-1K. To our knowledge, this is the first independent evaluation of ChatGPT. We found that ChatGPT’s performance changes dramatically based on the requirement to show its work, failing 20% of the time when it provides work compared with 84% when it does not. Further, we identify several factors of MWPs, relating to the number of unknowns and the number of operations, that lead to a higher probability of failure when compared with the prior; in particular, across all experiments, the probability of failure increases linearly with the number of addition and subtraction operations. We have also released the dataset of ChatGPT’s responses to the MWPs to support further work on the characterization of LLM performance, and we present baseline machine learning models to predict whether ChatGPT can correctly answer an MWP.
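As a rough illustration of what such a baseline predictor can look like, the sketch below fits a logistic regression on hypothetical MWP features (number of unknowns, number of addition/subtraction operations, number of multiplication/division operations); the features, data, and coefficients are synthetic stand-ins, not the released dataset or the chapter's models.

```python
# A minimal sketch (not the authors' released code) of a baseline failure
# predictor: logistic regression over simple, hypothetical MWP features.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Hypothetical feature matrix: [num_unknowns, num_add_sub_ops, num_mul_div_ops]
X = rng.integers(low=0, high=6, size=(500, 3))
# Hypothetical labels: 1 = ChatGPT failed the problem. Failure probability here
# grows with the number of addition/subtraction operations, mirroring the
# linear trend reported in the chapter.
p_fail = np.clip(0.1 + 0.12 * X[:, 1], 0, 1)
y = rng.binomial(1, p_fail)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = LogisticRegression().fit(X_train, y_train)
print("held-out accuracy:", accuracy_score(y_test, clf.predict(X_test)))
print("coefficient on add/sub count:", clf.coef_[0][1])
```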
This chapter introduces the concept of metacognition from a cognitive perspective, where it refers to knowledge and mental processes that operate on one’s own cognition. We review different forms of metacognition that involve distinct types of explicit reasoning and automatic processes, as well as various measures and functional benefits. We articulate four conjectures regarding the nature of metacognition in the specific context of the ACT-R cognitive architecture: (1) it involves extracting information about processes in cognitive modules; (2) the information is quantitative and approximate rather than symbolic; (3) the metacognitive information is available in working memory for cognitive processing; and (4) general cognitive processes are sufficient to respond to a situation detected by metacognitive monitoring. We illustrate these principles with examples of past work involving neuro-symbolic models of perception and introspection into declarative models of decision-making. Finally, we situate this approach within the context of theories such as predictive coding and the Common Model of Cognition encompassing other cognitive architectures.
Metacognitive AI is closely connected to certifiable AI and trustworthy AI, two areas focused on equipping AI with trustworthiness guarantees in high-stakes domains. This chapter provides a systematic overview, tutorial, and discussion of the certified approaches in trustworthy deep learning. The chapter introduces essential terminologies, core methodologies, and representative applications of certified approaches. We believe that certified approaches, as a prerequisite for deploying AI in high-stakes and safety-critical applications, will be an essential tool in metacognitive AI, and we hope that this chapter can inspire readers to further advance the field of certifiable trustworthiness for metacognitive AI.
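To make the notion of a certified approach concrete, here is a minimal sketch of one representative technique, randomized smoothing; the base classifier, noise level, and sample count are illustrative assumptions, and a real pipeline would use far more samples and a statistically sound lower bound on the top-class probability.

```python
# A minimal sketch of randomized smoothing: a base classifier is queried on
# Gaussian-perturbed copies of the input, and the vote margin yields a radius
# within which the smoothed classifier's prediction is provably constant.
import numpy as np
from scipy.stats import norm

def base_classifier(x: np.ndarray) -> int:
    """Stand-in classifier: two classes separated by a linear boundary."""
    return int(x[0] + 0.5 * x[1] > 0.3)

def smoothed_prediction(x, sigma=0.25, n=2000, rng=np.random.default_rng(0)):
    votes = np.array([base_classifier(x + sigma * rng.normal(size=x.shape))
                      for _ in range(n)])
    p_top = max(votes.mean(), 1 - votes.mean())  # empirical top-class probability
    radius = sigma * norm.ppf(p_top)             # certified L2 radius (illustrative)
    return int(round(votes.mean())), radius

x = np.array([0.8, 0.4])
cls, radius = smoothed_prediction(x)
print(f"smoothed class = {cls}, certified L2 radius ~ {radius:.3f}")
```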
This chapter presents a metacognitive AI approach via formal verification and repair of neural networks (NNs). We observe that neural network repair is a form of metacognition, where trained AI systems relearn until specifications hold. We detail Veritex, a tool for reachability analysis and repair of deep NNs (DNNs). Veritex includes methods for exact and over-approximative reachability analysis of DNNs. The exact methods can compute the exact output reachable domain, as well as the exact unsafe input space that causes safety violations of DNNs. Based on the exact unsafe input–output reachable domain, Veritex can repair unsafe DNNs on multiple safety properties with negligible performance degradation, by updating the DNN parameters via retraining. Veritex primarily addresses the synthesis of provably safe DNNs, which has not yet been significantly addressed in the literature. Veritex is evaluated for safety verification and DNN repair. Benchmarks for verification include ACAS Xu, and benchmarks for repair include an unsafe ACAS Xu network and an unsafe agent trained with deep reinforcement learning (DRL); in both cases Veritex is able to modify the NNs until safety is proven.
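The following is a minimal, hypothetical sketch of a verify-then-repair loop in this spirit; it does not use Veritex's actual API or its exact reachability methods. Interval propagation gives an over-approximate output bound, and if a toy safety property fails, the network is retrained to push the reachable set back into the safe region while staying close to its original behavior.

```python
# A hypothetical sketch of verification-driven repair (not Veritex's API):
# over-approximate the reachable output set by interval propagation, then
# retrain until the safety property is proven on the bound.
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 1))

def output_upper_bound(model, lo, hi):
    """Differentiable interval propagation; returns an upper bound on the output."""
    for layer in model:
        if isinstance(layer, nn.Linear):
            center, radius = (lo + hi) / 2, (hi - lo) / 2
            c = layer(center)
            r = radius @ layer.weight.abs().t()
            lo, hi = c - r, c + r
        else:  # ReLU
            lo, hi = lo.clamp(min=0), hi.clamp(min=0)
    return hi

# Toy safety property: for every input in [-1, 1]^2 the output must stay below 2.0.
lo, hi = torch.tensor([[-1.0, -1.0]]), torch.tensor([[1.0, 1.0]])
ref_in = torch.rand(256, 2) * 2 - 1
ref_out = net(ref_in).detach()                     # remember nominal behaviour

opt = torch.optim.Adam(net.parameters(), lr=1e-2)
for step in range(500):
    ub = output_upper_bound(net, lo, hi)
    violation = torch.relu(ub - 2.0).sum()         # how far the bound is unsafe
    if violation.item() == 0.0:
        print(f"property proven safe after {step} repair steps")
        break
    drift = ((net(ref_in) - ref_out) ** 2).mean()  # stay close to original behaviour
    opt.zero_grad()
    (violation + 0.1 * drift).backward()
    opt.step()
```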
In this chapter, we use task failure as a trigger to engage in metacognitive processes. We present a procedure by which an agent may exploit failure in the zero-shot outputs of LLMs as a trigger to investigate alternative solutions to the problem using object interactions and knowledge of the object semantics. We additionally propose a method through which knowledge gained from the object interactions can be distilled back into the LLM, and we outline avenues for future research.
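A minimal sketch of that failure-triggered control flow is shown below; the LLM planner, executor, and object-interaction routines are stand-in stubs, so only the loop structure (fail, explore, distil back) is meaningful.

```python
# A hypothetical sketch of failure-triggered metacognition with stubbed components.
def propose_plan_with_llm(task: str, knowledge: list[str]) -> list[str]:
    """Stub for a zero-shot LLM planner, optionally conditioned on new facts."""
    if knowledge:
        return ["open(cabinet)", "pick(cup)", "place(cup, shelf)"]
    return ["pick(cup)", "place(cup, shelf)"]

def execute(plan: list[str]) -> bool:
    """Stub executor: the task only succeeds if the cabinet is opened first."""
    return plan[0] == "open(cabinet)"

def explore_object_interactions(task: str) -> list[str]:
    """Stub exploration: interact with objects to recover missing semantics."""
    return ["fact: the shelf is reachable only after opening the cabinet"]

def solve(task: str, knowledge: list[str]) -> list[str]:
    plan = propose_plan_with_llm(task, knowledge)
    if execute(plan):
        return plan
    # Metacognitive trigger: failure starts deliberate object-level exploration.
    knowledge.extend(explore_object_interactions(task))  # distil back for reuse
    return propose_plan_with_llm(task, knowledge)

knowledge: list[str] = []
print(solve("put the cup on the shelf", knowledge))
print("distilled knowledge:", knowledge)
```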
We investigate the incorporation of metacognitive capabilities into Machine Learning Integrated with Network (MLIN) systems and develop a machine Learning Integrated with Knowledge (mLINK) stratum. This stratum is aimed at integrating knowledge obtained from multiple MLIN elements and reflecting on ML application performance outcomes in order to provide feedback on metacognitive actions aimed at ensuring performance and improving ML application robustness to Data Quality (DQ) variations. We discuss multiple use cases to show how knowledge of the interrelationships between MLIN components, DQ, and ML application performance can be generated and employed by mLINK. We elaborate on how this knowledge is integrated into mLINK to produce metaknowledge, viewed as recommendations on the adaptation actions or strategies needed. We define the process of employing these recommendations by mLINK as metacognition and describe multiple examples of utilizing these metacognitive strategies in practice, such as optimizing the data collection; reflection on DQ; DQ assurance; enhanced transfer learning; and Federated Learning for enhancing security, privacy, collaboration, and communication in MLIN.
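As a rough, hypothetical illustration of such a feedback loop, the sketch below computes simple DQ indicators, relates them to observed ML performance, and emits adaptation recommendations; the indicators, thresholds, and actions are assumptions made for illustration, not the chapter's mLINK design.

```python
# A hypothetical metacognitive feedback loop: monitor data quality, reflect on
# performance, and emit metaknowledge as adaptation recommendations.
import numpy as np

def dq_indicators(batch: np.ndarray) -> dict:
    return {
        "missing_rate": float(np.isnan(batch).mean()),
        "noise_level": float(np.nanstd(batch)),
    }

def recommend(dq: dict, accuracy: float) -> list[str]:
    actions = []
    if accuracy < 0.8 and dq["missing_rate"] > 0.1:
        actions.append("increase sampling rate / re-collect data")
    if accuracy < 0.8 and dq["noise_level"] > 2.0:
        actions.append("apply denoising or trigger transfer learning")
    return actions or ["no adaptation needed"]

batch = np.array([[1.0, np.nan, 3.0], [4.0, 5.0, np.nan]])
print(recommend(dq_indicators(batch), accuracy=0.72))
```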
To enhance understanding and collaboration with autonomous agents, it is crucial to construct a representation of their task strategies that integrates interpretability, monitoring, and formal reasoning. This dual-purpose representation fosters human comprehension and enables automated analytical processes. We achieve this balance by formalizing task strategies through temporal logic formulas. Recent trends emphasize inferring temporal logic formulas from data to explain system behaviors and assess autonomous agents’ competencies. Our methodology relies on positive and negative examples from system observations to construct a concise temporal logic formula consistent with the data. However, existing approaches often overlook real-world data’s noise and uncertainties, limiting practical deployment. Addressing this, we analyze labeled trajectories and aim to infer interpretable formulas that minimize misclassification loss. To tackle data uncertainties, we focus on labeled interval trajectories. Our algorithm maximizes the worst-case robustness margin, enhancing formula robustness and ensuring the adaptability and reliability of temporal logic inference in real-world applications.
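For one simple formula template, the sketch below illustrates the worst-case robustness idea: choosing a threshold c for the formula "always (x > c)" so that the smallest robustness margin over labelled interval trajectories is maximised. The template, trajectories, and grid search are illustrative simplifications, not the chapter's full inference algorithm.

```python
# A simplified sketch of maximising the worst-case robustness margin of the
# formula "G (x > c)" over labelled interval trajectories.
import numpy as np

# Each trajectory is an array of per-time-step intervals [lo_t, hi_t];
# positives should satisfy the formula, negatives should violate it.
positives = [np.array([[1.0, 1.2], [1.1, 1.4], [0.9, 1.3]])]
negatives = [np.array([[0.2, 0.6], [0.1, 0.5], [0.3, 0.8]])]

def worst_case_margin(c: float) -> float:
    """Smallest signed robustness margin over all labelled interval trajectories."""
    margins = []
    for traj in positives:
        # The formula must hold for every signal in the tube: use the lower envelope.
        margins.append(np.min(traj[:, 0] - c))
    for traj in negatives:
        # The formula must fail for every signal in the tube: use the upper envelope.
        margins.append(-np.min(traj[:, 1] - c))
    return min(margins)

candidates = np.linspace(0.0, 1.5, 151)
best_c = max(candidates, key=worst_case_margin)
print(f"best threshold c = {best_c:.2f}, worst-case margin = {worst_case_margin(best_c):.2f}")
```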
One of the central aspects of metacognitive AI is the AI agent’s ability to reason about its own behavior. In particular, for AI systems to be deployed in real-world applications with high impact, it is crucial that we can reason about and guarantee their fairness and robustness. Here, we provide a probabilistic reasoning framework to audit and enforce fairness of automated decision-making systems, using classifiers as the main example, while being robust to uncertainties and noise in the distribution.
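A minimal, illustrative audit in this spirit might look like the sketch below: estimate the demographic parity gap of a classifier, then probe how much that estimate can move when a fraction of the sensitive attributes are noisy. The data, classifier, and noise model are all hypothetical, and this empirical probe is not the chapter's probabilistic reasoning framework.

```python
# A hypothetical fairness audit with a noise-robustness probe.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 4))
group = rng.integers(0, 2, size=2000)          # sensitive attribute
y = (X[:, 0] + 0.5 * group + rng.normal(scale=0.5, size=2000) > 0).astype(int)

clf = LogisticRegression().fit(np.c_[X, group], y)
pred = clf.predict(np.c_[X, group])

rates = [pred[group == g].mean() for g in (0, 1)]
gap = abs(rates[0] - rates[1])                 # demographic parity gap

# Probe robustness: randomly flip a fraction eps of group attributes and
# record how much the audited gap can move.
eps, trials, worst = 0.05, 200, gap
for _ in range(trials):
    noisy = group.copy()
    flip = rng.random(len(noisy)) < eps
    noisy[flip] = 1 - noisy[flip]
    noisy_rates = [pred[noisy == g].mean() for g in (0, 1)]
    worst = max(worst, abs(noisy_rates[0] - noisy_rates[1]))
print(f"demographic parity gap = {gap:.3f}; worst gap over noisy audits = {worst:.3f}")
```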
By integrating hard constraints into neural network outputs, we not only improve the reliability of AI systems but also pave the way for meta-cognitive capabilities that ensure the alignment of predictions with domain-specific knowledge. This topic has received a lot of attention; however, existing methods either impose the constraints in a “weak” form at training time, with no guarantees at inference, or fail to provide a general framework that supports different tasks and constraint types. We tackle this open problem from a neuro-symbolic perspective, developing a pipeline that enhances a conventional neural predictor with a symbolic reasoning module capable of correcting structured prediction errors and a neural attention module that learns to direct the reasoning effort toward potential prediction errors while keeping other outputs unchanged. This framework provides an appealing trade-off between the efficiency of constraint-free neural inference and the prohibitive cost of exhaustive reasoning at inference time, while satisfying the rigorous demands of meta-cognitive assurance.
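A toy sketch of that division of labor is given below: a hypothetical neural predictor emits per-position probabilities, a confidence-based attention step flags the least certain position as the likely error, and a symbolic step repairs only that position so a hard constraint (here, digits summing to a known total) holds, leaving everything else unchanged. The probabilities and constraint are illustrative, not from the chapter.

```python
# A toy neuro-symbolic correction pipeline: attention focuses the symbolic
# repair on the least confident output position only.
import numpy as np

# Hypothetical per-position softmax outputs over digits 0-9 (3 positions).
probs = np.array([
    [0.01, 0.02, 0.90, 0.02, 0.01, 0.01, 0.01, 0.01, 0.005, 0.005],  # confident "2"
    [0.02, 0.05, 0.05, 0.40, 0.35, 0.05, 0.03, 0.02, 0.02, 0.01],    # uncertain 3 vs 4
    [0.01, 0.01, 0.01, 0.01, 0.02, 0.90, 0.01, 0.01, 0.01, 0.01],    # confident "5"
])
target_sum = 11  # hard constraint on the structured output

pred = probs.argmax(axis=1)                      # unconstrained neural prediction
if pred.sum() != target_sum:
    # Attention: direct the reasoning effort to the least confident position.
    focus = probs.max(axis=1).argmin()
    # Symbolic repair: choose the value at `focus` that restores the constraint.
    needed = target_sum - (pred.sum() - pred[focus])
    if 0 <= needed <= 9:
        pred[focus] = needed
print("constrained prediction:", pred, "sum:", pred.sum())
```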
Text-to-image (T2I) diffusion models require large-scale training data to achieve their impressive performance. Still, they seem to lack a common understanding of semantics such as spatial composition, and they exhibit spurious correlations that raise ethical concerns. Scaling data and model size does not lead to better semantics; instead, it seems to hurt the model. Recent works have shown the few-shot concept learning abilities of T2I models on simple concepts like cat or dog. Following this line of research, in this chapter we introduce the use of Concept Algebra for learning new concepts in a resource-efficient way. To do so, we present three works focusing on concept learning to show its effectiveness: (1) creating a benchmark for large-scale evaluations of concept learning methodologies, (2) reducing ethical biases through few-shot concept learning with Concept Algebra, and (3) learning spatial relationships via few-shot concept adaptation. Through this research, we describe our efforts to create few-shot synthetic data that is robust and reduces the biases present in various forms.
AI systems have struggled to be deployed in safety-critical applications where the consequences of incorrect predictions are severe. In complex applications and environments, like autonomous driving, it is often impossible or impractical to curate a dataset or simulator that sufficiently spans the entire input space, making it improbable that a perfect agent can be trained offline. Metacognitive AI represents an approach to design agents that continue safely learning and adapting as they encounter new or uncertain scenarios in the environment, which improves their performance over time. A key component to achieve this behavior is quantifying the AI agent’s prediction uncertainty to enable the agent to understand when it is operating in a previously unseen scenario. In this chapter, we discuss a framework for creating a metacognitive agent and delve deeper into Meta Modeling, a method for augmenting existing neural networks with uncertainty quantification. Our approach provides a first step toward realizing a metacognitive AI agent.
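As a generic, minimal illustration of augmenting a trained predictor with an auxiliary uncertainty signal (not the chapter's exact Meta Modeling construction), the sketch below pairs a base regressor with a nearest-neighbour distance score: inputs far from anything seen in training receive a high score, flagging that the agent is operating in an unfamiliar scenario.

```python
# A generic sketch of uncertainty augmentation: base predictor plus a simple
# distance-based unfamiliarity score over the training data.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(2000, 2))
y = np.sin(3 * X[:, 0]) + 0.5 * X[:, 1] ** 2

base = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0).fit(X, y)
novelty = NearestNeighbors(n_neighbors=10).fit(X)   # auxiliary "meta" component

def predict_with_uncertainty(x: np.ndarray):
    dist, _ = novelty.kneighbors(x)
    return base.predict(x), dist.mean(axis=1)        # prediction + unfamiliarity score

for name, x in [("in-distribution", [[0.1, -0.2]]), ("unseen scenario", [[4.0, 4.0]])]:
    pred, unc = predict_with_uncertainty(np.array(x))
    print(f"{name}: prediction={pred[0]:.2f}, uncertainty score={unc[0]:.2f}")
```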