One of the central aspects of metacognitive AI is the AI agent’s ability to reason about its own behavior. In particular, for AI systems to be deployed in real-world applications with high impact, it is crucial that we can reason about and guarantee their fairness and robustness. Here, we provide a probabilistic reasoning framework to audit and enforce fairness of automated decision-making systems, using classifiers as the main example, while being robust to uncertainties and noise in the distribution.
By integrating hard constraints into neural network outputs, we not only improve the reliability of AI systems but also pave the way for meta-cognitive capabilities that ensure the alignment of predictions with domain-specific knowledge.
This topic has received considerable attention; however, existing methods either impose the constraints in a “weak” form at training time, with no guarantees at inference, or fail to provide a general framework that supports different tasks and constraint types.
We tackle this open problem from a neuro-symbolic perspective, developing a pipeline that enhances a conventional neural predictor with a symbolic reasoning module capable of correcting structured prediction errors and a neural attention module that learns to direct the reasoning effort to focus on potential prediction errors, while keeping other outputs unchanged.
This framework offers an appealing trade-off between the efficiency of constraint-free neural inference and the prohibitive cost of exhaustive reasoning at inference time, while satisfying the rigorous demands of meta-cognitive assurance.
Text-to-image (T2I) diffusion models require large-scale training data to achieve their strong performance. Still, they seem to lack a common understanding of semantics such as spatial composition, and they exhibit spurious correlations that raise ethical concerns. Increasing data and model size does not appear to help in learning better semantics; instead, it seems to hurt the model. Recent works have shown the few-shot concept learning abilities of T2I models on simple concepts like cat or dog. Following this line of research, this chapter introduces the use of Concept Algebra for learning new concepts in a resource-efficient way.
To do that, we present three works on concept learning that show its effectiveness: (1) creating a benchmark for large-scale evaluation of concept learning methodologies, (2) reducing ethical biases with Concept Algebra through few-shot concept learning, and (3) learning spatial relationships through few-shot concept adaptation. Through this research, we describe efforts to create few-shot synthetic data that is robust and that reduces biases present in various forms.
It has proven difficult to deploy AI systems in safety-critical applications where the consequences of incorrect predictions are severe. In complex applications and environments, like autonomous driving, it is often impossible or impractical to curate a dataset or simulator that sufficiently spans the entire input space, making it improbable that a perfect agent can be trained offline. Metacognitive AI represents an approach to designing agents that continue to learn and adapt safely as they encounter new or uncertain scenarios in the environment, improving their performance over time. A key component in achieving this behavior is quantifying the AI agent’s prediction uncertainty so that the agent can recognize when it is operating in a previously unseen scenario. In this chapter, we discuss a framework for creating a metacognitive agent and delve deeper into Meta Modeling, a method for augmenting existing neural networks with uncertainty quantification. Our approach provides a first step toward realizing a metacognitive AI agent.
The chapter discusses the critical role of predictive uncertainty and diversity in enhancing the robustness and generalizability of embodied AI and robot learning. It explores the need for robots to efficiently learn and act in the unpredictable physical world by considering diverse scenarios and their consequences. The chapter highlights the importance of distinguishing between evaluative and generative paradigms of uncertainty, emphasizing the need to balance accuracy, uncertainty, and computational complexity in robot models. It examines various sources of uncertainty, including physical and model limitations, partial observability, environment dynamics, and domain shifts. Additionally, it outlines techniques for quantifying uncertainty, such as variance, entropy, and Bayesian methods, and underscores the significance of leveraging uncertainty in decision-making, exploration, and learning robust models. By addressing uncertainty in perception, representation, planning, and control, the chapter aims to improve the reliability and safety of robotic systems in diverse and dynamic environments.
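As a rough illustration of two of the quantification techniques named above, the following sketch computes the entropy of a categorical prediction and the variance across an ensemble of scalar predictions. It is not taken from the chapter: the function names, the ensemble setup, and the `threshold_fraction` parameter are assumptions made for the example, and Bayesian methods are not covered.

```python
import math
from statistics import pvariance


def predictive_entropy(probs):
    """Shannon entropy (in nats) of a categorical predictive distribution."""
    # Zero-probability classes contribute nothing to the sum.
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def ensemble_mean_and_variance(predictions):
    """Mean and population variance of scalar predictions from several models.

    A large variance means the (hypothetical) ensemble members disagree.
    """
    mean = sum(predictions) / len(predictions)
    return mean, pvariance(predictions, mu=mean)


def is_uncertain(probs, threshold_fraction=0.5):
    """Flag a prediction whose entropy exceeds a fraction of the maximum, log(K).

    threshold_fraction is an illustrative knob, not a recommended value.
    """
    max_entropy = math.log(len(probs))
    return predictive_entropy(probs) > threshold_fraction * max_entropy
```

A decision-making loop could, for example, call `is_uncertain` on each perception output and fall back to a cautious action or request more observations when it returns true.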
Classification of movement trajectories has many applications in transportation. Supervised neural models represent the current state of the art. Recent security applications require these models to be deployed rapidly in environments that may differ from those represented in the training data and for which little training data is available. We provide a neuro-symbolic rule-based framework for error detection and correction of these models to support eventual deployment in security applications.
The long game of AI aims at developing agents that are progressively more human-like in an ever-growing number of facets. Such agents must be able to explain the causes and effects of events and attitudes of agents in their world, including their own attitudes. This state of affairs can only be brought about if the agents are endowed with metacognitive abilities. In this chapter, we highlight the importance of metacognition for modeling the phenomenon of trust. Specifically, we present the case for the interdependence of metacognition and mutual trust between members of human-AI teams. We also argue that metacognition based on causality and contentful explanations requires knowledge support that models human semantic and episodic memory, as well as knowledge of language. We illustrate this point with examples from systems developed using the OntoAgent cognitive architecture.
Currently, there is a gap in the literature regarding effective post-deployment interventions for LLMs. Existing methods like few-shot or zero-shot prompting show promise but lack certainty in post-prompting performance and rely heavily on human expertise for error detection and prompt crafting. Against this backdrop, we divide the challenges for LLM intervention into three parts. First, the “black-box” nature of LLMs obscures the malfunction source within the multitude of parameters, complicating targeted intervention. Second, rectification typically depends on domain experts to identify errors, hindering scalability and automation. Third, the architectural complexity and sheer size of LLMs render pinpointed intervention an overwhelmingly daunting task.
Here, we call for a novel paradigm for LLM intervention inspired by cognitive science principles. This paradigm aims to equip LLMs with self-awareness in error identification and correction, emulating human cognitive efficiency. It would enable LLMs to form transparent decision-making pathways guided by human-comprehensible concepts, allowing for precise model intervention.
Metacognition is the concept of reasoning about an agent’s own internal processes and was originally introduced in the field of developmental psychology. In this position chapter, we examine the concept of applying metacognition to artificial intelligence (AI). We introduce a framework for understanding metacognitive AI that we call TRAP: transparency, reasoning, adaptation, and perception.
The integration of AI into information systems will affect the way users interface with these systems. This exploration of the interaction and collaboration between humans and AI reveals its potential and challenges, covering issues such as data privacy, credibility of results, misinformation, and search interactions. Later chapters delve into application domains such as healthcare and scientific discovery. In addition to providing new perspectives on and methods for developing AI technology and designing more humane and efficient AI systems, the book also reveals the shortcomings of AI technologies through case studies and puts forward corresponding countermeasures and suggestions. This book is ideal for researchers, students, and industry practitioners interested in enhancing human-centered AI systems and in insights for future research.
This groundbreaking volume is designed to meet the burgeoning needs of the research community and industry. This book delves into the critical aspects of AI's self-assessment and decision-making processes, addressing the imperative for safe and reliable AI systems in high-stakes domains such as autonomous driving, aerospace, manufacturing, and military applications. Featuring contributions from leading experts, the book provides comprehensive insights into the integration of metacognition within AI architectures, bridging symbolic reasoning with neural networks, and evaluating learning agents' competency. Key chapters explore assured machine learning, handling AI failures through metacognitive strategies, and practical applications across various sectors. Covering theoretical foundations and numerous practical examples, this volume serves as an invaluable resource for researchers, educators, and industry professionals interested in fostering transparency and enhancing reliability of AI systems.