Knowledge-Based Multiagent Credit Assignment: A Study on Task Type and Critic Information
Published in: | IEEE Systems Journal 2007-09, Vol.1 (1), p.55-67 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Abstract: | Multiagent credit assignment (MCA) is one of the major problems in realizing multiagent reinforcement learning. Since the environment is usually not intelligent enough to evaluate individual agents in a cooperative team, it is important to develop methods for assigning credit to individual agents when only a single team reinforcement is available. MCA cannot be solved in the general case with a single technique. Therefore, our goal in this research is, first, to present a new view of the problem and, second, to introduce the idea of using agents' knowledge to partially solve MCA. An approach based on agents' learning histories and knowledge is proposed. Knowledge evaluation-based credit assignment (KEBCA), along with certainty, a measure of agents' knowledge, is developed to judge agents' actions and to assign them proper credit. The proposed KEBCA method is general; however, we study it in some simulated extreme cases to gain better insight into the MCA problem and to evaluate our approach in such cases. More specifically, we study the effects of task type (and-type and or-type tasks) on solving the MCA problem in two cases. In the first case, it is assumed that some extra information at the team level is available in addition to the team reinforcement. In the second case, no such extra information exists. In addition, the performance of the system is examined in the presence of uncertainty in the environment, modeled as noise on agents' actions. The information content of the team reinforcements and of the assumed extra information is theoretically calculated and discussed. The mathematical calculations confirm the related simulation results. |
---|---|
ISSN: | 1932-8184, 1937-9234 |
DOI: | 10.1109/JSYST.2007.901641 |
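
The abstract's point that a single team reinforcement leaves individual credit ambiguous, and that task type changes how ambiguous it is, can be illustrated with a small sketch. It assumes that an and-type task rewards the team only when every agent acts correctly and that an or-type task rewards the team when at least one agent acts correctly. These definitions, and the names `team_reward` and `consistent_outcomes`, are assumptions made for illustration. They are not taken from the article, and the sketch is not the KEBCA method.

```python
from itertools import product


def team_reward(correct, task_type):
    """Return the single team reinforcement (1 = reward, 0 = punishment).

    `correct` is a tuple of booleans, one per agent (hypothetical encoding).
    The and/or semantics below are an assumption, not the article's definition.
    """
    if task_type == "and":
        return int(all(correct))
    if task_type == "or":
        return int(any(correct))
    raise ValueError(f"unknown task type: {task_type!r}")


def consistent_outcomes(n_agents, task_type, reward):
    """List every per-agent correctness pattern that yields the given team reward."""
    return [
        pattern
        for pattern in product([False, True], repeat=n_agents)
        if team_reward(pattern, task_type) == reward
    ]


if __name__ == "__main__":
    # Count how many individual-correctness patterns a critic cannot tell apart
    # when it only sees the team signal, for a team of three agents.
    for task_type in ("and", "or"):
        for reward in (1, 0):
            count = len(consistent_outcomes(3, task_type, reward))
            print(f"{task_type}-type, team reward {reward}: {count} consistent pattern(s)")
```

Under these assumptions, a rewarded and-type team and a punished or-type team each admit exactly one pattern, so individual credit is unambiguous there. The opposite signals admit every other pattern, and that is where additional information, such as the agents' own knowledge, would be needed.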