13 - Hierarchically organised behaviour and its neural foundations: a reinforcement-learning perspective

from Part II - Computational neuroscience models

Published online by Cambridge University Press:  05 November 2011

Anil K. Seth
Affiliation: University of Sussex
Tony J. Prescott
Affiliation: University of Sheffield
Joanna J. Bryson
Affiliation: University of Bath

Summary

Research on human and animal behaviour has long emphasised its hierarchical structure – the divisibility of ongoing behaviour into discrete tasks, which comprise subtask sequences, which in turn are built of simple actions. The hierarchical structure of behaviour has also been of enduring interest within neuroscience, where it has been widely considered to reflect prefrontal cortical functions. In this chapter, we re-examine behavioural hierarchy and its neural substrates from the point of view of recent developments in computational reinforcement learning. Specifically, we consider a set of approaches known collectively as hierarchical reinforcement learning, which extend the reinforcement learning paradigm by allowing the learning agent to aggregate actions into reusable subroutines or skills. A close look at the components of hierarchical reinforcement learning suggests how they might map onto neural structures, in particular regions within the dorsolateral and orbital prefrontal cortex. It also suggests specific ways in which hierarchical reinforcement learning might provide a complement to existing psychological models of hierarchically structured behaviour. A particularly important question that hierarchical reinforcement learning brings to the fore is that of how learning identifies new action routines that are likely to provide useful building blocks in solving a wide range of future problems. Here and at many other points, hierarchical reinforcement learning offers an appealing framework for investigating the computational and neural underpinnings of hierarchically structured behaviour.
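
As a rough illustration of what aggregating actions into a reusable subroutine can look like computationally, the sketch below loosely follows the options formulation of Sutton, Precup and Singh (1999), in which a skill bundles an initiation set, an internal policy and a termination condition. The corridor task, the `Option` class and every name in the code are hypothetical and chosen only for illustration; this is not the chapter's own model.

```python
from dataclasses import dataclass
from typing import Callable, Set, Tuple


@dataclass
class Option:
    """A temporally extended action (hypothetical illustration)."""
    name: str
    initiation_set: Set[int]           # states in which the option may start
    policy: Callable[[int], int]       # state -> primitive action (+1 or -1)
    terminates: Callable[[int], bool]  # state -> True once the option is done


def step(state: int, action: int, n_states: int = 10) -> int:
    """Toy corridor dynamics: move left or right, clipped to the corridor."""
    return max(0, min(n_states - 1, state + action))


def run_option(option: Option, state: int, max_steps: int = 100) -> Tuple[int, int]:
    """Run an option until it terminates; return (final state, steps taken)."""
    if state not in option.initiation_set:
        raise ValueError(f"{option.name} cannot start in state {state}")
    steps = 0
    while not option.terminates(state) and steps < max_steps:
        state = step(state, option.policy(state))
        steps += 1
    return state, steps


# A reusable skill: walk right until the "doorway" at state 5 is reached.
go_to_door = Option(
    name="go_to_door",
    initiation_set=set(range(0, 5)),
    policy=lambda s: +1,
    terminates=lambda s: s == 5,
)

final_state, duration = run_option(go_to_door, state=1)
print(final_state, duration)
```

A higher-level learner could then choose among such options exactly as it would choose among primitive actions, which is the sense in which the skill becomes a building block.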

In recent years, it has become increasingly common within both psychology and neuroscience to explore the applicability of ideas from machine learning. Indeed, one can now cite numerous instances where this strategy has been fruitful. Arguably, however, no area of machine learning has had as profound and sustained an impact on psychology and neuroscience as that of computational reinforcement learning (RL). The impact of RL was initially felt in research on classical and instrumental conditioning (Barto and Sutton, 1981; Sutton and Barto, 1990; Wickens et al., 1995). Soon thereafter, its impact extended to research on midbrain dopaminergic function, where the temporal-difference learning paradigm provided a framework for interpreting temporal profiles of dopaminergic activity (Barto, 1995; Houk et al., 1995; Montague et al., 1996; Schultz et al., 1997). Subsequently, actor–critic architectures for RL have inspired new interpretations of functional divisions of labour within the basal ganglia and cerebral cortex (see Joel et al., 2002, for a review), and RL-based accounts have been advanced to address issues as diverse as motor control (e.g., Miyamoto et al., 2004), working memory (e.g., O’Reilly and Frank, 2006), performance monitoring (e.g., Holroyd and Coles, 2002), and the distinction between habitual and goal-directed behaviour (e.g., Daw et al., 2005).
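
For readers unfamiliar with temporal-difference learning, the following minimal sketch shows the prediction-error signal that the dopamine literature cited above builds on. It assumes a tabular TD(0) learner of the kind described in Sutton and Barto (1998), applied to a hypothetical five-step trial with a single reward at the end; the function name and parameter values are illustrative assumptions, not taken from the chapter.

```python
def td0_chain(n_steps=5, episodes=200, alpha=0.1, gamma=1.0):
    """Tabular TD(0) on a fixed sequence of states with reward on the last step.

    Returns the learned values plus the per-step prediction errors from the
    first and the last episode.
    """
    values = [0.0] * (n_steps + 1)  # values[n_steps] is the terminal state
    first_errors, last_errors = [], []
    for episode in range(episodes):
        errors = []
        for t in range(n_steps):
            reward = 1.0 if t == n_steps - 1 else 0.0
            # TD error: reward plus discounted next-state value minus current value
            delta = reward + gamma * values[t + 1] - values[t]
            values[t] += alpha * delta
            errors.append(delta)
        if episode == 0:
            first_errors = errors
        last_errors = errors
    return values, first_errors, last_errors


values, first_errors, last_errors = td0_chain()
print("first episode errors:", first_errors)
print("last episode errors: ", [round(d, 4) for d in last_errors])
```

Early in training the error is large at the moment of reward; once the reward is fully predicted, the error at that moment shrinks towards zero.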

Type: Chapter
Publisher: Cambridge University Press
Print publication year: 2011

References

Agre, P. E. (1988).
Aldridge, J. W. and Berridge, K. C. (1998). Coding of serial order by neostriatal neurons: a ‘natural action’ approach to movement sequence. J. Neurosci. 18, 2777.
Aldridge, J. W., Berridge, K. C. and Rosen, A. R. (2004). Basal ganglia neural mechanisms of natural movement sequences. Can. J. Physiol. Pharmacol. 82, 732.
Alexander, G. E., Crutcher, M. D. and DeLong, M. R. (1990). Basal ganglia-thalamocortical circuits: parallel substrates for motor, oculomotor, ‘prefrontal’ and ‘limbic’ functions. Prog. Brain Res. 85, 119.
Alexander, G. E., DeLong, M. R. and Strick, P. L. (1986). Parallel organization of functionally segregated circuits linking basal ganglia and cortex. Annu. Rev. Neurosci. 9, 357.
Allport, A. and Wylie, G. (2000). Task-switching, stimulus-response bindings and negative priming. In Control of Cognitive Processes: Attention and Performance, XVIII, ed. Monsell, S. and Driver, J. Cambridge, MA: MIT Press, 35.
Anderson, J. R. (2004). An integrated theory of mind. Psychol. Rev., 1036.
Andre, D. and Russell, S. J. (2001). Programmable reinforcement learning agents. Adv. Neural Inf. Proc. Syst. 13, 1019.
Andre, D. and Russell, S. J. (2002).
Ansuini, C., Santello, M., Massaccesi, S. and Castiello, U. (2006). Effects of end-goal on hand shaping. J. Neurophysiol. 95, 2456.
Arbib, M. A. (1985). Schemas for the temporal organization of behaviour. Hum. Neurobiol. 4, 63.
Asaad, W. F., Rainer, G. and Miller, E. K. (2000). Task-specific neural activity in the primate prefrontal cortex. J. Neurophysiol. 84, 451.
Averbeck, B. B. and Lee, D. (2007). Prefrontal neural correlates of memory for sequences. J. Neurosci. 27, 2204.
Badre, D. (2008). Cognitive control, hierarchy, and the rostro–caudal organization of the frontal lobes. Trends Cogn. Sci. 12, 193.
Balleine, B. W. and Dickinson, A. (1998). Goal-directed instrumental action: contingency and incentive learning and their cortical substrates. Neuropharmacology 37, 407.
Barto, A. G. (1995). Adaptive critics and the basal ganglia. In Models of Information Processing in the Basal Ganglia, ed. Houk, J. C., Davis, J. L. and Beiser, D. G. Cambridge, MA: MIT Press, 215.
Barto, A. G. and Mahadevan, S. (2003). Recent advances in hierarchical reinforcement learning. Discrete Event Dyn. S. 13, 343.
Barto, A. G., Singh, S. and Chentanez, N. (2004).
Barto, A. G. and Sutton, R. S. (1981). Toward a modern theory of adaptive networks: expectation and prediction. Psychol. Rev. 88, 135.
Barto, A. G., Sutton, R. S. and Anderson, C. W. (1983). Neuronlike adaptive elements that can solve difficult learning control problems. IEEE T. Syst. Man and Cyb. 13, 834.
Berlyne, D. E. (1960). Conflict, Arousal and Curiosity. New York: McGraw-Hill.
Bhatnagar, S. and Panigrahi, J. R. (2006). Actor-critic algorithms for hierarchical Markov decision processes. Automatica 42, 637.
Bogacz, R., Brown, E., Moehlis, J., Holmes, P. and Cohen, J. D. (2006). The physics of optimal decision making: a formal analysis of models of performance in two-alternative forced-choice tasks. Psychol. Rev. 113, 700.
Bor, D., Duncan, J., Wiseman, R. J. and Owen, A. M. (2003). Encoding strategies dissociate prefrontal activity from working memory demand. Neuron 37, 361.
Botvinick, M. and Plaut, D. C. (2002). Representing task context: proposals based on a connectionist model of action. Psychol. Res., 298.
Botvinick, M. and Plaut, D. C. (2004). Doing without schema hierarchies: a recurrent connectionist approach to normal and impaired routine sequential action. Psychol. Rev. 111, 395.
Botvinick, M. and Plaut, D. C. (2006). Such stuff as habits are made on: a reply to Cooper and Shallice (2006). Psychol. Rev. 113, 917.
Botvinick, M. M. (2007). Multilevel structure in behaviour and the brain: a model of Fuster's hierarchy. Phil. Trans. Roy. Soc. B 362, 1615.
Botvinick, M. M. (2008). Hierarchical models of behavior and prefrontal function. Trends Cogn. Sci. 12, 201.
Bruner, J. (1973). Organization of early skilled action. Child Dev. 44, 1.
Bunge, S. A. (2004). How we use rules to select actions: a review of evidence from cognitive neuroscience. Cogn. Affect. Behav. Ne. 4, 564.
Bunzeck, N. and Duzel, E. (2006). Absolute coding of stimulus novelty in the human substantia nigra/VTA. Neuron 51, 369.
Cohen, J. D., Braver, T. S. and O’Reilly, R. C. (1996). A computational approach to prefrontal cortex, cognitive control and schizophrenia: recent developments and current challenges. Phil. Trans. Roy. Soc. B 351, 1515.
Cohen, J. D., Dunbar, K. and McClelland, J. L. (1990). On the control of automatic processes: a parallel distributed processing account of the Stroop effect. Psychol. Rev. 97, 332.
Conway, C. M. and Christiansen, M. H. (2001). Sequential learning in non-human primates. Trends Cogn. Sci. 5, 539.
Cooper, R. and Shallice, T. (2000). Contention scheduling and the control of routine activities. Cogn. Neuropsychol. 17, 297.
Courtney, S. M., Roth, J. K. and Sala, J. B. A hierarchical biased-competition model of domain-dependent working memory maintenance and executive control. In Working Memory: Behavioural and Neural Correlates, ed. Osaka, N., Logie, R. and D’Esposito, M. Oxford: Oxford University Press, 369.
D’Esposito, M. (2007). From cognitive to neural models of working memory. Phil. Trans. Roy. Soc. B 362, 761.
Daw, N. D., Courville, A. C. and Touretzky, D. S. (2003). Timing and partial observability in the dopamine system. In Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press, 99.
Daw, N. D., Niv, Y. and Dayan, P. (2005). Uncertainty-based competition between prefrontal and striatal systems for behavioral control. Nat. Neurosci. 8, 1704.
Daw, N. D., Niv, Y. and Dayan, P. (2006). Actions, policies, values and the basal ganglia. In Recent Breakthroughs in Basal Ganglia Research, ed. Bezard, E. New York: Nova Science Publishers, 369.
De Pisapia, N. and Goddard, N. H. (2003). A neural model of frontostriatal interactions for behavioral planning and action chunking. Neurocomputing 52.
Dehaene, S. and Changeux, J.-P. (1997). A hierarchical neuronal network for planning behavior. Proc. Nat. Acad. Sci. 94, 13293.
Dell, G. S., Berger, L. K. and Svec, W. R. (1997). Language production and serial order. Psychol. Rev. 104, 123.
Dietterich, T. G. (1998).
Dietterich, T. G. (2000). Hierarchical reinforcement learning with the MAXQ value function decomposition. J. Artif. Intell. Res. 13, 227.
Elfwing, S., Uchibe, K. and Christensen, H. I. (2007). Evolutionary development of hierarchical learning structures. IEEE Trans. Evol. Comput. 11, 249.
Estes, W. K. (1972). An associative basis for coding and organization in memory. In Coding Processes in Human Memory, ed. Melton, A. W. and Martin, E. Washington, DC: V. H. Winston and Sons, 161.
Ribas-Fernandes, J., Solway, A. and Diuk, C.
Fischer, K. W. (1980). A theory of cognitive development: the control and construction of hierarchies of skills. Psychol. Rev. 87, 477.
Fischer, K. W. and Connell, M. W. (2003). Two motivational systems that shape development: epistemic and self-organizing. B. J. Educ. Psychol. 2, 103.
Frank, M. J. and Claus, E. D. (2006). Anatomy of a decision: striato-orbitofrontal interactions in reinforcement learning, decision making, and reversal. Psychol. Rev. 113, 300.
Fujii, N. and Graybiel, A. M. (2003). Representation of action sequence boundaries by macaque prefrontal cortical neurons. Science 301, 1246.
Fuster, J. M. (1997). The Prefrontal Cortex: Anatomy, Physiology, and Neuropsychology of the Frontal Lobe. Philadelphia, PA: Lippincott-Raven.
Fuster, J. M. (2001). The prefrontal cortex – an update: time is of the essence. Neuron 30, 319.
Fuster, J. M. (2004). Upper processing stages of the perception-action cycle. Trends Cogn. Sci. 8, 143.
Gergely, G. and Csibra, G. (2003). Teleological reasoning in infancy: the naive theory of rational action. Trends Cogn. Sci. 7, 287.
Gopnik, A., Glymour, C. and Sobel, D. (2004). A theory of causal learning in children: causal maps and Bayes nets. Psychol. Rev. 111, 1.
Gopnik, A. and Schulz, L. (2004). Mechanisms of theory formation in young children. Trends Cogn. Sci. 8, 371.
Grafman, J. (2002). The human prefrontal cortex has evolved to represent components of structured event complexes. In Handbook of Neuropsychology, ed. Grafman, J. Amsterdam: Elsevier, 157.
Graybiel, A. M. (1995). Building action repertoires: memory and learning functions of the basal ganglia. Curr. Opin. Neurobiol. 5, 733.
Graybiel, A. M. (1998). The basal ganglia and chunking of action repertoires. Neurobiol. Learn. Mem. 70, 119.
Greenfield, P. M. (1984). A theory of the teacher in the learning activities of everyday life. In Everyday Cognition: Its Development in Social Context, ed. Rogoff, B. and Lave, J. Cambridge, MA: Harvard University Press, 117.
Greenfield, P. M., Nelson, K. and Saltzman, E. (1972). The development of rulebound strategies for manipulating seriated cups: a parallel between action and grammar. Cogn. Psychol. 3, 291.
Greenfield, P. M. and Schneider, L. (1977). Building a tree structure: the development of hierarchical complexity and interrupted strategies in children's construction activity. Dev. Psychol. 13, 299.
Grossberg, S. (1986). The adaptive self-organization of serial order in behavior: speech, language, and motor control. In Pattern Recognition by Humans and Machines, Volume 1: Speech Perception, ed. Schwab, E. C. and Nusbaum, H. C. New York: Academic Press, 187.
Hamilton, A. F. d. C. and Grafton, S. T. (2008). Action outcomes are represented in human inferior frontoparietal cortex. Cereb. Cortex 18, 1160.
Harlow, H. F., Harlow, M. K. and Meyer, D. R. (1950). Learning motivated by a manipulation drive. J. Exp. Psychol. 40, 228.
Haruno, M. and Kawato, M. (2006). Heterarchical reinforcement-learning model for integration of multiple cortico-striatal loops: fMRI examination in stimulus-action-reward association learning. Neural Networks 19, 1242.
Hayes-Roth, B. and Hayes-Roth, F. (1979). A cognitive model of planning. Cogn. Sci. 3, 275.
Hengst, B. (2002). Discovering hierarchy in reinforcement learning with HEXQ. P. Int. C. Mach. Learn. 19, 243.
Holroyd, C. B. and Coles, M. G. H. (2002). The neural basis of human error processing: reinforcement learning, dopamine, and the error-related negativity. Psychol. Rev. 109, 679.
Hoshi, E., Shima, K. and Tanji, J. (1998). Task-dependent selectivity of movement-related neuronal activity in the primate prefrontal cortex. J. Neurophysiol. 80, 3392.
Houk, J. C., Adams, C. M. and Barto, A. G. (1995). A model of how the basal ganglia generate and use neural signals that predict reinforcement. In Models of Information Processing in the Basal Ganglia, ed. Houk, J. C., Davis, J. L. and Beiser, D. G. Cambridge, MA: MIT Press, 249.
Joel, D., Niv, Y. and Ruppin, E. (2002). Actor-critic models of the basal ganglia: new anatomical and computational perspectives. Neural Networks 15, 535.
Johnston, K. and Everling, S. (2006). Neural activity in monkey prefrontal cortex is modulated by task context and behavioral instruction during delayed-match-to-sample and conditional prosaccade–antisaccade tasks. J. Cogn. Neurosci. 18, 749.
Jonsson, A. and Barto, A. (2001). Automated state abstraction for options using the U-tree algorithm. In Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press, 1054.
Jonsson, A. and Barto, A. (2005).
Kambhampati, S., Mali, A. D. and Srivastava, B. (1998). Hybrid planning for partially hierarchical domains. In Proceedings of the Fifteenth National Conference on Artificial Intelligence (AAAI-98). Madison, WI: AAAI Press, 882.
Kaplan, F. and Oudeyer, P.-Y. (2004). Maximizing learning progress: an internal reward system for development. In Embodied Artificial Intelligence, ed. Iida, F., Pfeifer, R. and Steels, L. Berlin: Springer-Verlag, 259.
Kearns, M. and Singh, S. (2002). Near-optimal reinforcement learning in polynomial time. Mach. Learn. 49, 209.
Koechlin, E., Ody, C. and Kouneiher, F. (2003). The architecture of cognitive control in the human prefrontal cortex. Science 302, 1181.
Krueger, K. A. and Dayan, P. (2008). Flexible shaping. Presented at Cosyne (Computational and Systems Neuroscience), Salt Lake City, Utah.
Laird, J. E., Rosenbloom, P. S. and Newell, A. (1986). Chunking in Soar: the anatomy of a general learning mechanism. Mach. Learn. 1, 11.
Landrum, E. R. (2005). Production of negative transfer in a problem-solving task. Psychol. Rep. 97, 861.
Lashley, K. S. (1951). The problem of serial order in behavior. In Cerebral Mechanisms in Behavior: The Hixon Symposium, ed. Jeffress, L. A. New York, NY: Wiley, 112.
Lee, I. H., Seitz, A. R. and Assad, J. A. (2006). Activity of tonically active neurons in the monkey putamen during initiation and withholding of movement. J. Neurophysiol. 95, 2391.
Lee, F. J. and Taatgen, N. A. (2003). Production compilation: a simple mechanism to model complex skill acquisition. Hum. Factors, 61.
Lehman, J. F., Laird, J. and Rosenbloom, P. (1996). A gentle introduction to Soar, an architecture for human cognition. In Invitation to Cognitive Science, ed. Sternberg, S. and Scarborough, D. Cambridge, MA: MIT Press, 212.
Li, L. and Walsh, T. J. (2006).
Logan, G. D. (2003). Executive control of thought and action: in search of the wild homunculus. Curr. Dir. Psychol. Sci. 12, 45.
Luchins, A. S. (1942). Mechanization in problem solving. Psychol. Monogr. 248, 1.
MacDonald, A. W., Cohen, J. D., Stenger, V. A. and Carter, C. S. (2000). Dissociating the role of the dorsolateral prefrontal and anterior cingulate cortex in cognitive control. Science 288, 1835.
MacKay, D. G. (1987). The Organization of Perception and Action: A Theory for Language and Other Cognitive Skills. New York: Springer-Verlag.
Mannor, S., Menache, I., Hoze, A. and Klein, U. (2004). Dynamic abstraction in reinforcement learning via clustering. In Proceedings of the Twenty-First International Conference on Machine Learning. New York: ACM Press, 560.
Marthi, B., Russell, S. J. and Wolfe, J. (2007).
McGovern, A. (2002).
Mehta, S., Ray, P., Tadepalli, P. and Dietterich, T. (2008). Automatic discovery and transfer of MAXQ hierarchies. Paper presented at International Conference on Machine Learning, Helsinki, Finland.
Meltzoff, A. N. (1995). Understanding the intentions of others: re-enactment of intended acts by 18-month-old children. Dev. Psychol. 31, 838.
Menache, I., Mannor, S. and Shimkin, N. (2002).
Middleton, F. A. and Strick, P. L. (2002). Basal-ganglia ‘projections’ to the prefrontal cortex of the primate. Cereb. Cortex 12, 926.
Miller, E. K. and Cohen, J. D. (2001). An integrative theory of prefrontal cortex function. Annu. Rev. Neurosci., 167.
Miller, G. A., Galanter, E. and Pribram, K. H. (1960). Plans and the Structure of Behavior. New York: Holt, Rinehart and Winston.
Minton, S., Hayes, P. J. and Fain, J. (1985). Controlling Search in Flexible Parsing.
Miyamoto, H., Morimoto, J., Doya, K. and Kawato, M. (2004). Reinforcement learning with via-point representation. Neural Networks 17, 299.
Monsell, S. (2003). Task switching. Trends Cogn. Sci. 7, 134.
Monsell, S., Yeung, N. and Azuma, R. (2000). Reconfiguration of task-set: is it easier to switch to the weaker task? Psychol. Res. 63, 250.
Montague, P. R., Dayan, P. and Sejnowski, T. J. (1996). A framework for mesencephalic dopamine based on predictive Hebbian learning. J. Neurosci. 16, 1936.
Morris, G., Arkadir, D., Nevet, A., Vaadia, E. and Bergman, H. (2004). Coincident but distinct messages of midbrain dopamine and striatal tonically active neurons. Neuron 43, 133.
Muhammad, R., Wallis, J. D. and Miller, E. K. (2006). A comparison of abstract rules in the prefrontal cortex, premotor cortex, inferior temporal cortex, and striatum. J. Cogn. Neurosci. 18, 974.
Nason, S. and Laird, J. E. (2005). Soar-RL: integrating reinforcement learning with Soar. Cogn. Syst. Res. 6, 51.
Newell, A. and Simon, H. A. (1963). GPS, a program that simulates human thought. In Computers and Thought, ed. Feigenbaum, E. A. and Feldman, J. New York: McGraw-Hill, 279.
Newtson, D. (1976). Foundations of attribution: the perception of ongoing behavior. In New Directions in Attribution Research, ed. Harvey, J. H., Ickes, W. J. and Kidd, R. F. Hillsdale, NJ: Erlbaum, 223.
O’Doherty, J., Critchley, H., Deichmann, R. and Dolan, R. J. (2003). Dissociating valence of outcome from behavioral control in human orbital and ventral prefrontal cortices. J. Neurosci. 79.
O’Doherty, J., Dayan, P. and Schultz, P. (2004). Dissociable roles of ventral and dorsal striatum in instrumental conditioning. Science 304, 452.
O’Reilly, R. C. and Frank, M. J. (2006). Making working memory work: a computational model of learning in prefrontal cortex and basal ganglia. Neural Comput. 18, 283.
Oudeyer, P.-Y., Kaplan, F. and Hafner, V. (2007). Intrinsic motivation systems for autonomous development. IEEE T. Evol. Comput. 11, 265.
Parent, A. and Hazrati, L. N. (1995). Functional anatomy of the basal ganglia. I. The cortico-basal ganglia-thalamo-cortical loop. Brain Res. Rev. 20, 91.
Parr, R. and Russell, S. (1998). Reinforcement learning with hierarchies of machines. Adv. Neural Inf. Proc. Syst. 10, 1043.
Pashler, H. (1994). Dual-task interference in simple tasks: data and theory. Psychol. Bull. 116, 220.
Petrides, M. (1995). Impairments on nonspatial self-ordered and externally ordered working memory tasks after lesions to the mid-dorsal part of the lateral frontal cortex in the monkey. J. Neurosci. 15, 359.
Piaget, J. (1936). The Origins of Intelligence in Children. New York: International Universities Press.
Pickett, M. and Barto, A. G. (2002). PolicyBlocks: an algorithm for creating useful macro-actions in reinforcement learning. In Machine Learning: Proceedings of the Nineteenth International Conference on Machine Learning, ed. Sammut, C. and Hoffmann, A. San Francisco: Morgan Kaufmann, 506.
Postle, B. R. (2006). Working memory as an emergent property of the mind and brain. Neurosci. 139, 23.
Ravel, S., Sardo, P., Legallet, E. and Apicella, P. (2006). Influence of spatial information on responses of tonically active neurons in the monkey striatum. J. Neurophysiol. 95, 2975.
Rayman, W. E. (1982). Negative transfer: a threat to flying safety. Aviat. Space Envir. Md., 1224.
Reason, J. T. (1992). Human Error. Cambridge: Cambridge University Press.
Redgrave, P. and Gurney, K. (2006). The short-latency dopamine signal: a role in discovering novel actions. Nat. Rev. Neurosci. 7, 967.
Roesch, M. R., Taylor, A. R. and Schoenbaum, G. (2006). Encoding of time-discounted rewards in orbitofrontal cortex is independent of value. Neuron 51, 509.
Rolls, E. T. (2004). The functions of the orbitofrontal cortex. Brain Cogn. 55, 11.
Rougier, N. P., Noelle, D. C., Braver, T. S., Cohen, J. D. and O’Reilly, R. C. (2005). Prefrontal cortex and flexible cognitive control: rules without symbols. Proc. Nat. Acad. Sci. 102, 7338.
Ruh, N. (2007).
Rumelhart, D. and Norman, D. A. (1982). Simulating a skilled typist: a study of skilled cognitive-motor performance. Cogn. Sci. 6, 1.
Rushworth, M. F. S., Walton, M. E., Kennerley, S. W. and Bannerman, D. M. (2004). Action sets and decisions in the medial frontal cortex. Trends Cogn. Sci. 8, 410.
Ryan, R. M. and Deci, E. L. (2000). Intrinsic and extrinsic motivation. Contemp. Edu. Psychol. 25, 54.
Saffran, J. R., Aslin, R. N. and Newport, E. L. (1996). Statistical learning by 8-month-old infants. Science 274, 1926.
Saffran, J. R. and Wilson, D. P. (2003). From syllables to syntax: multilevel statistical learning by 12-month-old infants. Infancy 4, 273.
Salinas, E. (2004). Fast remapping of sensory stimuli onto motor actions on the basis of contextual modulation. J. Neurosci. 24, 1113.
Schank, R. C. and Abelson, R. P. (1977). Scripts, Plans, Goals and Understanding. Hillsdale, NJ: Erlbaum.
Schmidhuber, J. (1991). A possibility for implementing curiosity and boredom in model-building neural controllers. In From Animals to Animats: Proceedings of the First International Conference on Simulation of Adaptive Behavior. Cambridge: MIT Press, 222.
Schneider, D. W. and Logan, G. D. (2006). Hierarchical control of cognitive processes: switching tasks in sequences. J. Exp. Psychol. 135, 623.
Schoenbaum, G., Chiba, A. A. and Gallagher, M. (1999). Neural encoding in orbitofrontal cortex and basolateral amygdala during olfactory discrimination learning. J. Neurosci. 19, 1876.
Schultz, W., Apicella, P. and Ljungberg, T. (1993). Responses of monkey dopamine neurons to reward and conditioned stimuli during successive steps of learning a delayed response task. J. Neurosci. 13, 900.
Schultz, W., Dayan, P. and Montague, P. R. (1997). A neural substrate of prediction and reward. Science 275, 1593.
Schultz, W., Tremblay, K. L. and Hollerman, J. R. (2000). Reward processing in primate orbitofrontal cortex and basal ganglia. Cereb. Cortex 10, 272.
Shallice, T. and Burgess, P. W. (1991). Deficits in strategy application following frontal lobe damage in man. Brain 114, 727.
Shima, K., Isoda, M., Mushiake, H. and Tanji, J. (2007). Categorization of behavioural sequences in the prefrontal cortex. Nature 445, 315.
Shima, K. and Tanji, J. (2000). Neuronal activity in the supplementary and presupplementary motor areas for temporal organization of multiple movements. J. Neurophysiol. 84, 2148.
Shimamura, A. P. (2000). The role of the prefrontal cortex in dynamic filtering. Psychobiol. 28, 207.
Simsek, O., Wolfe, A. and Barto, A. (2005). Identifying useful subgoals in reinforcement learning by local graph partitioning. In Proceedings of the Twenty-Second International Conference on Machine Learning (ICML 05). New York: ACM, 816.
Singh, S., Barto, A. G. and Chentanez, N. (2005). Intrinsically motivated reinforcement learning. In Advances in Neural Information Processing Systems 17: Proceedings of the 2004 Conference, ed. Saul, L. K., Weiss, Y. and Bottou, L. Cambridge, MA: MIT Press.
Sirigu, A., Zalla, T. and Pillon, B. (1995). Selective impairments in managerial knowledge in patients with pre-frontal cortex lesions. Cortex 31, 301.
Sommerville, J. and Woodward, A. L. (2005). Pulling out the intentional structure of action: the relation between action processing and action production in infancy. Cognition, 1.
Sommerville, J. A. and Woodward, A. L. (2005). Infants’ sensitivity to the causal features of means–end support sequences in action and perception. Infancy 8, 119.
Suri, R. E., Bargas, J. and Arbib, M. A. (2001). Modeling functions of striatal dopamine modulation in learning and planning. Neurosci. 103, 65.
Sutton, R. S. and Barto, A. G. (1990). Time-derivative models of Pavlovian reinforcement. In Learning and Computational Neuroscience: Foundations of Adaptive Networks, ed. Gabriel, M. and Moore, J. Cambridge, MA: MIT Press, 497.
Sutton, R. S. and Barto, A. G. (1998). Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press.
Sutton, R. S., Precup, D. and Singh, S. (1999). Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning. Artif. Intell. 112, 181.
Tenenbaum, J. B. and Saxe, R. R. (2006). Bayesian Models of Action Understanding. Cambridge, MA: MIT Press.
Thrun, S. B. and Schwartz, A. (1995). Finding structure in reinforcement learning. In Advances in Neural Information Processing Systems: Proceedings of the 1994 Conference, ed. Tesauro, G., Touretzky, D. S. and Leen, T. Cambridge, MA: MIT Press, 385.
Wallis, J. D., Anderson, K. C. and Miller, E. K. (2001). Single neurons in prefrontal cortex encode abstract rules. Nature 411, 953.
Wallis, J. D. and Miller, E. K. (2003). From rule to response: neuronal processes in the premotor and prefrontal cortex. J. Neurophysiol. 90, 1790.
Ward, G. and Allport, A. (1997). Planning and problem-solving using the five-disc Tower of London task. Q. J. Exp. Psychol. 50.
White, I. M. (1999). Rule-dependent neuronal activity in the prefrontal cortex. Exp. Brain Res. 126, 315.
White, R. W. (1959). Motivation reconsidered: the concept of competence. Psychol. Rev. 66, 297.
Wickens, J., Kotter, R. and Houk, J. C. (1995). Cellular models of reinforcement. In Models of Information Processing in the Basal Ganglia, ed. Houk, J. C., Davis, J. L. and Beiser, D. G. Cambridge, MA: MIT Press, 187.
Wolpert, D. and Flanagan, J. (2001). Motor prediction. Curr. Biol. 18, R729.
Wood, J. N. and Grafman, J. (2003). Human prefrontal cortex: processing and representational perspectives. Nature Rev. Neurosci. 4, 139.
Woodward, A. L., Sommerville, J. A. and Guajardo, J. J. (2001). How infants make sense of intentional action. In Intentions and Intentionality: Foundations of Social Cognition, ed. Malle, B. F., Moses, L. J. and Baldwin, D. A. Cambridge, MA: MIT Press, 149.
Yamada, S. and Tsuji, S. (1989).
Yan, Z. and Fischer, K. (2002). Always under construction: dynamic variations in adult cognitive microdevelopment. Hum. Dev. 45, 141.
Zacks, J. M., Braver, T. S. and Sheridan, M. A. (2001). Human brain activity time-locked to perceptual event boundaries. Nature Neurosci. 4, 651.
Zacks, J. M., Speer, N. K., Swallow, K. M., Braver, T. S. and Reynolds, J. R. (2007). Event perception: a mind/brain perspective. Psychol. Bull. 133, 273.
Zacks, J. M. and Tversky, B. (2001). Event structure in perception and conception. Psychol. Bull. 127, 3.
Zalla, T., Pradat-Diehl, P. and Sirigu, A. (2003). Perception of action boundaries in patients with frontal lobe damage. Neuropsychologia 41, 1619.
Zhou, W. and Coggins, R. (2002). Computational models of the amygdala and the orbitofrontal cortex: a hierarchical reinforcement learning system for robotic control. In Lecture Notes AI: LNAI 2557, ed. McKay, I. and Slaney, J. Berlin: Springer-Verlag, 419.
Zhou, W. and Coggins, R. (2004). Biologically inspired reinforcement learning: reward-based decomposition for multi-goal environments. In Biologically Inspired Approaches to Advanced Information Technology, ed. Ijspeert, A. J., Murata, M. and Wakamiya, N. Berlin: Springer-Verlag.
