Team learning from human demonstration with coordination confidence

  • Bikramjit Banerjee (a1), Syamala Vittanala (a1) and Matthew Edmund Taylor (a2)


Among the techniques proposed to speed up reinforcement learning (RL), learning from human demonstration has a proven record of success. A related technique, Human-Agent Transfer, and its confidence-based derivatives have been successfully applied to single-agent RL. This article investigates their application to collaborative multi-agent RL problems. We show that a first-cut extension may leave room for improvement in some domains, and propose a new algorithm called coordination confidence (CC). CC analyzes the difference in perspective between the human demonstrator (global view) and the learning agents (local view), and informs the agents’ action choices when that difference is critical and simply following the human demonstration could lead to miscoordination. We conduct experiments in three domains to compare the performance of CC with relevant baselines.



