Retrospective End-User Walkthrough: A Method for Assessing How People Combine Multiple AI Models in Decision-Making Systems
Saved in:

Main Authors: , , , ,
Format: Article
Language: English
Subjects:
Online Access: Order full text
Abstract:
Evaluating human-AI decision-making systems is an emerging challenge, as new ways of combining multiple AI models towards a specific goal are proposed every day. As humans interact with AI in decision-making systems, multiple factors may be present in a task, including trust, interpretability, and explainability, amongst others. In this context, this work proposes a retrospective method to support a more holistic understanding of how people interact with and connect multiple AI models, and combine their outputs, in human-AI decision-making systems. The method consists of a retrospective end-user walkthrough intended to help HCI practitioners understand the higher-order cognitive processes in place and the role that AI model outputs play in human-AI decision-making. The method was qualitatively assessed with 29 participants (four in a pilot phase; 25 in the main user study) interacting with a human-AI decision-making system in the context of financial decision-making. The system combines visual analytics, three AI models for revenue prediction, AI-supported analogues analysis, and hypothesis testing using external news and natural language processing to provide multiple means of comparing companies. Beyond results on tasks and usability problems, the outcomes suggest that the method is promising for highlighting why AI models are ignored, used, or trusted, and how future interactions are planned. We suggest that HCI practitioners researching human-AI interaction can benefit from adding this step to user studies as a debriefing stage, applied much as a retrospective Thinking-Aloud protocol would be, but with emphasis on revisiting tasks and understanding why participants ignored or connected predictions while performing a task.
DOI: 10.48550/arxiv.2305.07530