Knowledge-Based Multiagent Credit Assignment: A Study on Task Type and Critic Information

Bibliographic details
Published in: IEEE Systems Journal, 2007-09, Vol. 1 (1), pp. 55-67
Main authors: Harati, A., Ahmadabadi, M.N., Araabi, B.N.
Format: Article
Language: English
Description
Abstract: Multiagent credit assignment (MCA) is one of the major problems in the realization of multiagent reinforcement learning. Since the environment usually is not intelligent enough to qualify individual agents in a cooperative team, it is very important to develop methods for assigning individual agents' credits when only a single team reinforcement is available. MCA cannot be solved in the general case using a single technique. Therefore, our goal in this research is first to present a new view of the problem and second to introduce a new idea of using agents' knowledge to partially solve MCA. An approach based on agents' learning histories and knowledge is proposed to solve the MCA problem. Knowledge evaluation-based credit assignment (KEBCA), along with certainty, a measure of agents' knowledge, is developed to judge agents' actions and to assign them proper credits. The proposed KEBCA method is general; however, we study it in some simulated extreme cases in order to gain better insight into the MCA problem and to evaluate our approach in such cases. More specifically, we study the effects of task type (and-type and or-type tasks) on solving the MCA problem in two cases. In the first case, in addition to the team reinforcement, it is assumed that some extra information at the team level is available. In the second case, such extra information does not exist. In addition, the performance of the system is examined in the presence of some uncertainties in the environment, modeled as noise on agents' actions. The information content of team reinforcements and of the assumed extra information is theoretically calculated and discussed. The mathematical calculations confirm the related simulation results.
ISSN: 1932-8184, 1937-9234
DOI: 10.1109/JSYST.2007.901641