A comparative vignette study: Evaluating the potential role of a generative AI model in enhancing clinical decision-making in nursing

Bibliographic details
Published in: Journal of Advanced Nursing, 2024-02
Main authors: Saban, Mor; Dubovi, Ilana
Format: Article
Language: English
Online access: Full text
Description
Summary: This study explores the potential of a generative artificial intelligence tool (ChatGPT) as clinical support for nurses. Specifically, we aim to assess whether ChatGPT can demonstrate clinical decision-making equivalent to that of expert nurses and novice nursing students. This will be evaluated by comparing ChatGPT responses to clinical scenarios with those of nurses at different levels of experience. This is a cross-sectional study. Emergency room registered nurses (i.e. experts; n = 30) and nursing students (i.e. novices; n = 38) were recruited during March-April 2023. Clinical decision-making was measured using three validated clinical scenarios involving an initial assessment and reevaluation. The aspects of clinical decision-making assessed were the accuracy of initial assessments, the appropriateness of recommended tests and resource use, and the capacity to reevaluate decisions. Performance was also compared in terms of response generation time and word count. Expert nurses and novice students completed online questionnaires (via Qualtrics), while ChatGPT responses were obtained from OpenAI. Compared with novices and experts on the aspects of clinical decision-making: (1) ChatGPT exhibited indecisiveness in initial assessments; (2) ChatGPT tended to suggest unnecessary diagnostic tests; (3) when new information required reevaluation, ChatGPT responses demonstrated inaccurate understanding and inappropriate modifications. In terms of performance, the mean number of words in ChatGPT answers was 27-41 times greater than that used by both experts and novices, and responses were provided approximately four times faster than those of novices and twice as fast as those of expert nurses. ChatGPT responses maintained logical structure and clarity. A generative AI tool demonstrated indecisiveness and a tendency towards over-triage compared to human clinicians.
The study shows that implementing ChatGPT as a nurse's digital assistant should be approached with caution. Further research is needed to optimize the model's training and algorithms so that it provides accurate healthcare support that aids clinical decision-making. This study adhered to relevant EQUATOR guidelines for reporting observational studies. Patients were not directly involved in the conduct of this study.
ISSN: 0309-2402, 1365-2648
DOI: 10.1111/jan.16101