Team learning from human demonstration with coordination confidence

Published online by Cambridge University Press:  05 November 2019

Bikramjit Banerjee
Affiliation:
School of Computing Sciences and Computer Engineering, University of Southern Mississippi, Hattiesburg, MS 39406, USA; e-mail: Bikramjit.Banerjee@usm.edu
Syamala Vittanala
Affiliation:
School of Computing Sciences and Computer Engineering, University of Southern Mississippi, Hattiesburg, MS 39406, USA; e-mail: Bikramjit.Banerjee@usm.edu
Matthew Edmund Taylor
Affiliation:
School of Electrical Engineering & Computer Science, Washington State University, Pullman, WA 99164, USA; e-mail: taylorm@eecs.wsu.edu

Abstract

Among an array of techniques proposed to speed up reinforcement learning (RL), learning from human demonstration has a proven record of success. A related technique, called Human-Agent Transfer, and its confidence-based derivatives have been successfully applied to single-agent RL. This article investigates their application to collaborative multi-agent RL problems. We show that a first-cut extension may leave room for improvement in some domains, and propose a new algorithm called coordination confidence (CC). CC analyzes the difference in perspectives between a human demonstrator (global view) and the learning agents (local view) and informs the agents’ action choices when the difference is critical and simply following the human demonstration can lead to miscoordination. We conduct experiments in three domains to investigate the performance of CC in comparison with relevant baselines.

Information

Type
Research Article
Copyright
© Cambridge University Press, 2019 
