Analysis of Responses of GPT-4 V to the Japanese National Clinical Engineer Licensing Examination

Bibliographic Details
Published in:	Journal of medical systems 2024-09, Vol.48 (1), p.83, Article 83
Main authors:	Ishida, Kai; Arisaka, Naoya; Fujii, Kiyotaka
Format:	Article
Language:	eng
Subjects:	Artificial intelligence; Basic converters; Biological properties; Biomedical Engineering - organization & administration; Chatbots; Clinical medicine; East Asian People; Educational Measurement - methods; Electronic engineering; Engineers; Health Informatics; Health Sciences; Humans; Japan; Large language models; Licensing; Licensing examinations; Licensure - standards; Mechanical engineering; Medical devices; Medical materials; Medicine; Medicine & Public Health; Questions; Safety management; State-of-the-art reviews; Statistics for Life Sciences
Online access:	Full text

Description
Summary:	Chat Generative Pretrained Transformer (ChatGPT; OpenAI) is a state-of-the-art large language model that can simulate human-like conversations based on user input. We evaluated the performance of GPT-4 V on the Japanese National Clinical Engineer Licensing Examination using 2,155 questions from 2012 to 2023. The average correct answer rate for all questions was 86.0%. In particular, clinical medicine, basic medicine, medical materials, biological properties, and mechanical engineering achieved correct answer rates of ≥ 90%. Conversely, medical device safety management, electrical and electronic engineering, and extracorporeal circulation obtained low correct answer rates ranging from 64.8% to 76.5%. The correct answer rates for questions that included figures/tables, required numerical calculation, involved both figures/tables and calculation, and required knowledge of Japanese Industrial Standards were 55.2%, 85.8%, 64.2%, and 31.0%, respectively. The low correct answer rates are attributed to ChatGPT's lack of image recognition and of knowledge of standards and laws. This study concludes that careful attention is required when using ChatGPT because several of its explanations lack a correct description.
ISSN:	1573-689X, 0148-5598
DOI:	10.1007/s10916-024-02103-w