The double-edged sword effect of conformity on cooperation in spatial Prisoner’s Dilemma Games with reinforcement learning

Bibliographic details
Published in:	Chaos, solitons and fractals, 2024-10, Vol.187, p.115483, Article 115483
Main authors:	Wang, Pai; Yang, Zhihu
Format:	Article
Language:	English
Subjects:	Conformity; Evolution of cooperation; Prisoner’s Dilemma Game; Reinforcement learning
Online access:	Full text

Description
Summary:	Imitation based on fitness comparison has long been a representative strategy-updating method in evolutionary game theory, with the pursuit of profit maximization at its core. However, the method fails when other agents’ income information is inaccessible or prohibitively expensive to obtain. As an alternative, reinforcement learning has frequently been used to alleviate this problem, yet it rarely achieves socially optimal outcomes. To fill this gap, this study proposes a self-regarding Q-learning with a conformity effect and investigates its impact on the evolution of cooperation in a spatial Prisoner’s Dilemma Game. The instant reward of Q-learning is re-scaled following the logic that the more popular the strategy, the higher the re-scaled reward. The results reveal that reinforcement learning can alleviate social dilemmas by preventing both sides of a game from freezing in mutual defection, thereby facilitating the coexistence of cooperation and defection. Intriguingly, conformity has a two-sided effect on the evolution of cooperation that depends on the dilemma strength: for lower b-values it promotes cooperation, whereas for higher b-values it hinders cooperation. The reasons behind these phenomena are analyzed, and the simulation results are shown to be consistent with the theoretical analysis.
• A novel reinforcement learning with conformity effect is proposed.
• Conformity is a double-edged sword for the evolution of cooperation under reinforcement learning.
• Conformity enables reinforcement learning to yield socially optimal outcomes.
ISSN:	0960-0779
DOI:	10.1016/j.chaos.2024.115483
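
The summary describes a self-regarding Q-learning rule whose instant reward is re-scaled so that more popular strategies earn more. The record does not give the paper's formulas, so the following is only a minimal sketch under stated assumptions, not the authors' implementation: a weak Prisoner's Dilemma parameterization (R = 1, T = b, S = P = 0), a linear re-scaling factor `1 + w * share`, the agent's own previous action as the Q-learning state, and the hypothetical names `pd_payoff`, `conformity_reward`, `q_update`, `choose_action`, and `w`.

```python
import random

C, D = 0, 1  # strategy indices: cooperate, defect


def pd_payoff(own, other, b):
    """Payoff of `own` against `other` in a weak Prisoner's Dilemma.

    Assumed parameterization (not stated in the abstract):
    R = 1, T = b, S = P = 0.
    """
    if other == D:
        return 0.0
    return 1.0 if own == C else b


def conformity_reward(payoff, action, neighbor_actions, w):
    """Re-scale the instant reward by how popular `action` is nearby.

    Assumed linear form: payoff * (1 + w * share of neighbors playing `action`).
    """
    share = sum(1 for a in neighbor_actions if a == action) / len(neighbor_actions)
    return payoff * (1.0 + w * share)


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update, applied in place."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


def choose_action(q, state, epsilon, rng=random):
    """Epsilon-greedy choice that uses only the agent's own Q-table."""
    if rng.random() < epsilon:
        return rng.choice([C, D])
    return C if q[state][C] >= q[state][D] else D


# One learning step for a cooperator with three cooperating neighbors and one defector.
q_table = [[0.0, 0.0], [0.0, 0.0]]  # rows: previous own action, columns: next action
neighbors = [C, C, C, D]
payoff = sum(pd_payoff(C, n, b=1.2) for n in neighbors)
reward = conformity_reward(payoff, C, neighbors, w=0.5)
q_update(q_table, state=C, action=C, reward=reward, next_state=C)
```

In this sketch the agent never reads its neighbors' payoffs, only their visible actions, which matches the motivation in the summary that income information may be unavailable. Setting `w = 0` recovers plain Q-learning on the raw payoff.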