Evaluating Large Language Models for Radiology Natural Language Processing


Bibliographic Details
Main authors: Liu, Zhengliang; Zhong, Tianyang; Li, Yiwei; Zhang, Yutong; Pan, Yi; Zhao, Zihao; Dong, Peixin; Cao, Chao; Liu, Yuxiao; Shu, Peng; Wei, Yaonai; Wu, Zihao; Ma, Chong; Wang, Jiaqi; Wang, Sheng; Zhou, Mengyue; Jiang, Zuowei; Li, Chunlin; Holmes, Jason; Xu, Shaochen; Zhang, Lu; Dai, Haixing; Zhang, Kai; Zhao, Lin; Chen, Yuanhao; Liu, Xu; Wang, Peilong; Yan, Pingkun; Liu, Jun; Ge, Bao; Sun, Lichao; Zhu, Dajiang; Li, Xiang; Liu, Wei; Cai, Xiaoyan; Hu, Xintao; Jiang, Xi; Zhang, Shu; Zhang, Xin; Zhang, Tuo; Zhao, Shijie; Li, Quanzheng; Zhu, Hongtu; Shen, Dinggang; Liu, Tianming
Format: Article
Language: eng
Subjects: Computer Science - Computation and Language
creator Liu, Zhengliang
Zhong, Tianyang
Li, Yiwei
Zhang, Yutong
Pan, Yi
Zhao, Zihao
Dong, Peixin
Cao, Chao
Liu, Yuxiao
Shu, Peng
Wei, Yaonai
Wu, Zihao
Ma, Chong
Wang, Jiaqi
Wang, Sheng
Zhou, Mengyue
Jiang, Zuowei
Li, Chunlin
Holmes, Jason
Xu, Shaochen
Zhang, Lu
Dai, Haixing
Zhang, Kai
Zhao, Lin
Chen, Yuanhao
Liu, Xu
Wang, Peilong
Yan, Pingkun
Liu, Jun
Ge, Bao
Sun, Lichao
Zhu, Dajiang
Li, Xiang
Liu, Wei
Cai, Xiaoyan
Hu, Xintao
Jiang, Xi
Zhang, Shu
Zhang, Xin
Zhang, Tuo
Zhao, Shijie
Li, Quanzheng
Zhu, Hongtu
Shen, Dinggang
Liu, Tianming
description The rise of large language models (LLMs) has marked a pivotal shift in the field of natural language processing (NLP). LLMs have revolutionized a multitude of domains, and they have made a significant impact in the medical field. Large language models are now more abundant than ever, and many of these models exhibit bilingual capabilities, proficient in both English and Chinese. However, a comprehensive evaluation of these models remains to be conducted. This lack of assessment is especially apparent within the context of radiology NLP. This study seeks to bridge this gap by critically evaluating thirty-two LLMs in interpreting radiology reports, a crucial component of radiology NLP. Specifically, the ability to derive impressions from radiologic findings is assessed. The outcomes of this evaluation provide key insights into the performance, strengths, and weaknesses of these LLMs, informing their practical applications within the medical domain.
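To make the findings-to-impression task in the description concrete, the following minimal Python sketch illustrates what such an evaluation might look like. It is not taken from the paper: the prompt wording, the call_llm placeholder, and the unigram-F1 metric are illustrative assumptions standing in for whichever models and metrics the study actually used.

# Hypothetical sketch of the findings-to-impression evaluation described above.
# call_llm is a placeholder for whichever of the thirty-two evaluated models
# is being queried; it is NOT an API defined by the paper.

def build_prompt(findings: str) -> str:
    """Wrap a radiology FINDINGS section in a summarization instruction."""
    return (
        "You are a radiologist. Summarize the following findings into a "
        "concise IMPRESSION section.\n\nFindings:\n" + findings + "\n\nImpression:"
    )

def token_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 between a generated and a reference impression
    (a simple stand-in for overlap metrics such as ROUGE)."""
    pred, ref = prediction.lower().split(), reference.lower().split()
    common = sum(min(pred.count(t), ref.count(t)) for t in set(pred))
    if not common:
        return 0.0
    precision, recall = common / len(pred), common / len(ref)
    return 2 * precision * recall / (precision + recall)

# Usage (model call left as a placeholder):
# prompt = build_prompt("Mild cardiomegaly. No focal consolidation or effusion.")
# impression = call_llm(prompt)   # placeholder, model-specific
# score = token_f1(impression, "Mild cardiomegaly without acute findings.")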
doi_str_mv 10.48550/arxiv.2307.13693
format Article
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2307.13693
language eng
recordid cdi_arxiv_primary_2307_13693
source arXiv.org
subjects Computer Science - Computation and Language
title Evaluating Large Language Models for Radiology Natural Language Processing
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-20T08%3A32%3A53IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Evaluating%20Large%20Language%20Models%20for%20Radiology%20Natural%20Language%20Processing&rft.au=Liu,%20Zhengliang&rft.date=2023-07-25&rft_id=info:doi/10.48550/arxiv.2307.13693&rft_dat=%3Carxiv_GOX%3E2307_13693%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true