Performance of ChatGPT in Solving Questions From the Progress Test (Brazilian National Medical Exam): A Potential Artificial Intelligence Tool in Medical Practice

Bibliographic Details
Published in: Curēus (Palo Alto, CA), 2024-07, Vol. 16 (7), p. e64924
Main Authors: Rodrigues Alessi, Mateus; Gomes, Heitor A; Lopes de Castro, Matheus; Terumy Okamoto, Cristina
Format: Article
Language: English
Online Access: Full Text
Description
Abstract:

Background: The use of artificial intelligence (AI) is not a recent phenomenon, but the latest advancements in this technology are making a significant impact across various fields of human knowledge. In medicine, this trend is no different, although it has developed at a slower pace. ChatGPT is an example of an AI-based algorithm capable of answering questions, interpreting phrases, and synthesizing complex information, potentially aiding and even replacing humans in various areas of social interest. Some studies have compared its performance on medical knowledge exams with that of medical students and professionals to assess AI accuracy. This study aimed to measure the performance of ChatGPT in answering questions from the Progress Test from 2021 to 2023.

Methodology: An observational study was conducted in which questions from the 2021 Progress Test and from the 2022 and 2023 regional tests (Southern Institutional Pedagogical Support Center II) were presented to ChatGPT 3.5. The results were compared with the scores of first- to sixth-year medical students from over 120 Brazilian universities. All questions were presented sequentially, without any modification to their structure. After each question was presented, the platform's history was cleared and the site was restarted.

Results: The platform achieved average accuracy rates of 69.7%, 68.3%, and 67.2% in 2021, 2022, and 2023, respectively, surpassing students from all medical years in the three tests evaluated and reinforcing findings in the current literature. The AI's best subject was Public Health, with a mean score of 77.8%.

Conclusions: ChatGPT demonstrated the ability to answer medical questions with higher accuracy than humans, including students in the final year of medical school.
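Note: The study's fresh-context protocol (one question per session, with the chat history cleared and the site restarted each time) can be approximated programmatically. The sketch below is illustrative only: the authors used the ChatGPT 3.5 web interface, whereas this assumes the OpenAI Python client and the gpt-3.5-turbo model, and the question list is a placeholder.

# Minimal sketch of a fresh-context evaluation loop (assumptions noted above).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

questions = [
    "Progress Test question 1 ...",  # placeholder items, not the real exam text
    "Progress Test question 2 ...",
]

answers = []
for q in questions:
    # Each call starts from an empty message history, so no prior question
    # can influence the answer -- mirroring the paper's history-clearing step.
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": q}],
    )
    answers.append(response.choices[0].message.content)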
ISSN: 2168-8184
DOI: 10.7759/cureus.64924