Improving Patient Pre-screening for Clinical Trials: Assisting Physicians with Large Language Models

Physicians considering clinical trials for their patients are met with the laborious process of checking many text-based eligibility criteria. Large Language Models (LLMs) have been shown to perform well at clinical information extraction and clinical reasoning, including on medical tests, but have not yet been evaluated in real-world scenarios.

Detailed description

Saved in:
Bibliographic details
Main authors: Hamer, Danny M. den; Schoor, Perry; Polak, Tobias B; Kapitan, Daniel
Format: Article
Language: eng
Subjects:
Online access: Order full text
creator Hamer, Danny M. den
Schoor, Perry
Polak, Tobias B
Kapitan, Daniel
description Physicians considering clinical trials for their patients are met with the laborious process of checking many text-based eligibility criteria. Large Language Models (LLMs) have been shown to perform well at clinical information extraction and clinical reasoning, including on medical tests, but have not yet been evaluated in real-world scenarios. This paper investigates the use of InstructGPT to assist physicians in determining eligibility for clinical trials based on a patient's summarised medical profile. Using a prompting strategy combining one-shot, selection-inference and chain-of-thought techniques, we investigate the performance of LLMs on 10 synthetically created patient profiles. Performance is evaluated at four levels: the ability to identify screenable eligibility criteria from a trial given a medical profile; the ability to classify, for each individual criterion, whether the patient qualifies; the overall classification of whether a patient is eligible for a clinical trial; and the percentage of criteria to be screened by the physician. We evaluated against 146 clinical trials and a total of 4,135 eligibility criteria. The LLM correctly identified the screenability of 72% (2,994/4,135) of the criteria. Additionally, 72% (341/471) of the screenable criteria were evaluated correctly. The resulting trial-level classification as eligible or ineligible achieved a recall of 0.5. By leveraging LLMs with a physician-in-the-loop, a recall of 1.0 and a precision of 0.71 at the clinical-trial level can be achieved while reducing the number of criteria to be checked by an estimated 90%. LLMs can be used to assist physicians with pre-screening of patients for clinical trials. By forcing instruction-tuned LLMs to produce chain-of-thought responses, the reasoning is made transparent to physicians and the decision process becomes amenable to their review, thereby making such a system feasible for use in real-world scenarios.
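The description above names the prompting strategy (a one-shot example, selection-inference, and chain-of-thought) only at a high level, and this record does not reproduce the paper's actual prompts. The sketch below is a minimal, hypothetical illustration of how a per-criterion eligibility prompt of that kind could be assembled; the example profile, criterion, wording, and the build_criterion_prompt helper are invented for illustration and are not taken from the paper.

# Minimal sketch (Python) of a one-shot, chain-of-thought prompt for classifying a
# single eligibility criterion against a summarised patient profile. All wording,
# the example profile/criterion, and the helper are hypothetical; the paper's
# actual prompts are not reproduced in this record.

# One-shot demonstration: shows the model the expected reasoning-then-answer format.
ONE_SHOT_EXAMPLE = """\
Patient profile: 67-year-old male, stage III NSCLC, ECOG 1, no prior immunotherapy.
Criterion: Age >= 18 years.
Reasoning: The profile states the patient is 67 years old, which is at least 18.
Answer: ELIGIBLE
"""

def build_criterion_prompt(profile: str, criterion: str) -> str:
    """Combine the one-shot example with the new case. In a selection-inference
    style, one criterion is selected per call, and the model is asked to show its
    reasoning (chain of thought) before committing to a final label."""
    return (
        "You assist a physician with pre-screening a patient for a clinical trial.\n"
        "For the given criterion, first explain your reasoning, then answer with\n"
        "ELIGIBLE, INELIGIBLE, or UNKNOWN if the profile lacks the information.\n\n"
        f"{ONE_SHOT_EXAMPLE}\n"
        f"Patient profile: {profile}\n"
        f"Criterion: {criterion}\n"
        "Reasoning:"
    )

if __name__ == "__main__":
    # Invented demonstration inputs; a real system would iterate over all criteria
    # of a trial and send each prompt to an instruction-tuned model such as InstructGPT.
    profile = "72-year-old female, type 2 diabetes, eGFR 55 mL/min, on metformin."
    criterion = "Adequate renal function (eGFR >= 30 mL/min)."
    print(build_criterion_prompt(profile, criterion))

Exposing the "Reasoning:" text alongside the final label is what lets a physician-in-the-loop audit and override the model's per-criterion decisions, which is the mechanism behind the recall and workload figures quoted in the description.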
doi_str_mv 10.48550/arxiv.2304.07396
format Article
creationdate 2023-04-14
rights http://creativecommons.org/licenses/by/4.0
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2304.07396
language eng
recordid cdi_arxiv_primary_2304_07396
source arXiv.org
subjects Computer Science - Artificial Intelligence
Computer Science - Computation and Language
Computer Science - Learning
title Improving Patient Pre-screening for Clinical Trials: Assisting Physicians with Large Language Models
url https://arxiv.org/abs/2304.07396