A User Study on Explainable Online Reinforcement Learning for Adaptive Systems

Online reinforcement learning (RL) is increasingly used for realizing adaptive systems in the presence of design time uncertainty because Online RL can leverage data only available at run time. With Deep RL gaining interest, the learned knowledge is no longer represented explicitly but hidden in the...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	ACM transactions on autonomous and adaptive systems 2024-09, Vol.19 (3), p.1-44, Article 15
Hauptverfasser:	Metzger, Andreas, Laufer, Jan, Feit, Felix, Pohl, Klaus
Format:	Artikel
Sprache:	eng
Schlagworte:	Computing methodologies Designing software Machine learning Software and its engineering
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Online reinforcement learning (RL) is increasingly used for realizing adaptive systems in the presence of design time uncertainty because Online RL can leverage data only available at run time. With Deep RL gaining interest, the learned knowledge is no longer represented explicitly but hidden in the parameterization of the underlying artificial neural network. For a human, it thus becomes practically impossible to understand the decision-making of Deep RL, which makes it difficult for (1) software engineers to perform debugging, (2) system providers to comply with relevant legal frameworks, and (3) system users to build trust. The explainable RL technique XRL-DINE, introduced in earlier work, provides insights into why certain decisions were made at important time steps. Here, we perform an empirical user study concerning XRL-DINE involving 73 software engineers split into treatment and control groups. The treatment group is given access to XRL-DINE, while the control group is not. We analyze (1) the participants’ performance in answering concrete questions related to the decision-making of Deep RL, (2) the participants’ self-assessed confidence in giving the right answers, (3) the perceived usefulness and ease of use of XRL-DINE, and (4) the concrete usage of the XRL-DINE dashboard.
ISSN:	1556-4665 1556-4703
DOI:	10.1145/3666005