Chapter 22 - Neural Mechanisms of Arbitration between Multiple Experts in Value-Based Decision-Making

from Section V - Cognition–Emotion Interactions

Published online by Cambridge University Press: 16 September 2025

Jorge Armony
Affiliation: McGill University, Montréal
Patrik Vuilleumier
Affiliation: University of Geneva

Summary

The brain faces an array of behavioral control challenges that vary in complexity, abstraction, and temporal scale. Leveraging multiple decision-making strategies offers a clear advantage, allowing for adaptability to different contexts. Even when solving a single problem, selecting among or combining different strategies can enhance the likelihood of success. Consequently, the brain must arbitrate effectively between these experts. Here, we review theories of multiple controllers in value-driven decision-making, the mechanisms of arbitration between them, and the neural correlates of such processes. Although these theories have provided meaningful explanations for observed behavior and neural activity, fundamental questions persist regarding the precise nature of these controllers, their interactions, and their neural underpinnings. Notably, the role of subjective states in these computations has been largely overlooked, despite their obvious importance in the experience of making decisions.

Information

Type: Chapter
Publisher: Cambridge University Press
Print publication year: 2025


Accessibility standard: WCAG 2.0 A

The PDF of this book conforms to version 2.0 of the Web Content Accessibility Guidelines (WCAG) at the basic (A) level, which addresses core accessibility principles and essential accessibility barriers.

Content Navigation

Table of contents navigation
Allows you to navigate directly to chapters, sections, or non‐text items through a linked table of contents, reducing the need for extensive scrolling.
Index navigation
Provides an interactive index, letting you go straight to where a term or subject appears in the text without manual searching.

Reading Order & Textual Equivalents

Single logical reading order
You will encounter all content (including footnotes, captions, etc.) in a clear, sequential flow, making it easier to follow with assistive tools like screen readers.
Short alternative textual descriptions
You get concise descriptions (for images, charts, or media clips), ensuring you do not miss crucial information when visual or audio elements are not accessible.
Full alternative textual descriptions
You get more than just short alt text: you have comprehensive text equivalents, transcripts, captions, or audio descriptions for substantial non‐text content, which is especially helpful for complex visuals or multimedia.

Visual Accessibility

Use of high contrast between text and background colour
You benefit from high‐contrast text, which improves legibility if you have low vision or if you are reading in less‐than‐ideal lighting conditions.

Structural and Technical Features

ARIA roles provided
You gain clarity from ARIA (Accessible Rich Internet Applications) roles and attributes, as they help assistive technologies interpret how each part of the content functions.
