Plan-based reward shaping for multi-agent reinforcement learning
Published online by Cambridge University Press: 11 February 2016
Abstract
Recent theoretical results have justified the use of potential-based reward shaping as a way to improve the performance of multi-agent reinforcement learning (MARL). However, the question remains of how to generate a useful potential function.
Previous research demonstrated the use of STRIPS operator knowledge to automatically generate a potential function for single-agent reinforcement learning. Following up on this work, we investigate the use of STRIPS planning knowledge in the context of MARL.
Our results show that a potential function based on joint or individual plan knowledge can significantly improve MARL performance compared with no shaping. In addition, we investigate the limitations of individual plan knowledge as a source of reward shaping in cases where the combination of individual agent plans causes conflict.
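As a rough illustration of the idea summarised above, potential-based reward shaping adds a term of the form F(s, s') = γΦ(s') − Φ(s) to the environment reward, and a plan-based potential can assign higher values to states that lie further along a plan. The following is a minimal sketch under stated assumptions, not the authors' implementation: the names `plan_potential`, `shaping_reward`, and the scaling factor `omega` are hypothetical, the plan is assumed to be a simple sequence of abstract states, and states that do not appear in the plan are assumed to receive a potential of zero.

```python
from typing import Callable, Hashable, Sequence

State = Hashable


def plan_potential(plan: Sequence[State], omega: float = 1.0) -> Callable[[State], float]:
    """Build a potential function from an ordered plan (hypothetical helper).

    Assumption: the potential of a state is omega times its step index in the
    plan, and 0 for any state that does not appear in the plan.
    """
    # If a state occurs more than once in the plan, the latest index wins.
    step_of = {state: index for index, state in enumerate(plan)}

    def phi(state: State) -> float:
        return omega * step_of.get(state, 0)

    return phi


def shaping_reward(
    phi: Callable[[State], float],
    state: State,
    next_state: State,
    gamma: float = 0.99,
) -> float:
    """Potential-based shaping term F(s, s') = gamma * phi(s') - phi(s)."""
    return gamma * phi(next_state) - phi(state)


if __name__ == "__main__":
    # Illustrative abstract states only; they are not taken from the article.
    plan = ["start", "has_key", "door_open", "goal"]
    phi = plan_potential(plan, omega=10.0)
    # Moving forward along the plan yields a positive shaping term,
    # moving backward yields a negative one.
    print(shaping_reward(phi, "start", "has_key", gamma=1.0))
    print(shaping_reward(phi, "has_key", "start", gamma=1.0))
```

In a multi-agent setting, each agent could be given its own `phi` built from an individual plan, or all agents could share one built from a joint plan; which of these the article evaluates, and how, is described in the full text rather than in this sketch.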
- Type: Articles
- Information: The Knowledge Engineering Review, Volume 31, Issue 1: Adaptive Learning Agents, January 2016, pp. 44-58
- Copyright: © Cambridge University Press, 2016