End-To-End Clinical Trial Matching with Large Language Models

Matching cancer patients to clinical trials is essential for advancing treatment and patient care. However, the inconsistent format of medical free text documents and complex trial eligibility criteria make this process extremely challenging and time-consuming for physicians. We investigated whether...

Detailed description

Bibliographic details
Published in:	arXiv.org 2024-07
Main authors:	Ferber, Dyke; Hilgers, Lars; Wiest, Isabella C; Leßmann, Marie-Elisabeth; Clusmann, Jan; Neidlinger, Peter; Zhu, Jiefu; Wölflein, Georg; Lammert, Jacqueline; Tschochohei, Maximilian; Böhme, Heiko; Jäger, Dirk; Aldea, Mihaela; Truhn, Daniel; Höper, Christiane; Kather, Jakob Nikolas
Format:	Article
Language:	eng
Subjects:	Accuracy; Clinical trials; Criteria; Electronic health records; Large language models; Matching; Medical personnel; Oncology; Patients
Online access:	Full text

container_end_page
container_issue
container_start_page
container_title	arXiv.org
container_volume
creator	Ferber, Dyke; Hilgers, Lars; Wiest, Isabella C; Leßmann, Marie-Elisabeth; Clusmann, Jan; Neidlinger, Peter; Zhu, Jiefu; Wölflein, Georg; Lammert, Jacqueline; Tschochohei, Maximilian; Böhme, Heiko; Jäger, Dirk; Aldea, Mihaela; Truhn, Daniel; Höper, Christiane; Kather, Jakob Nikolas
description	Matching cancer patients to clinical trials is essential for advancing treatment and patient care. However, the inconsistent format of medical free-text documents and complex trial eligibility criteria make this process extremely challenging and time-consuming for physicians. We investigated whether the entire trial matching process, from identifying relevant trials among 105,600 oncology-related clinical trials on clinicaltrials.gov to generating criterion-level eligibility matches, could be automated using Large Language Models (LLMs). Using GPT-4o and a set of 51 synthetic Electronic Health Records (EHRs), we demonstrate that our approach identifies relevant candidate trials in 93.3% of cases and achieves a preliminary accuracy of 88.0% when matching patient-level information at the criterion level against a baseline defined by human experts. Utilizing LLM feedback reveals that 39.3% of the criteria that were initially considered incorrect are either ambiguous or inaccurately annotated, leading to a total model accuracy of 92.7% after refining our human baseline. In summary, we present an end-to-end pipeline for clinical trial matching using LLMs, demonstrating high precision in screening and matching trials to individual patients, even outperforming qualified medical doctors. Our fully end-to-end pipeline can operate autonomously or with human supervision and is not restricted to oncology, offering a scalable solution for enhancing patient-trial matching in real-world settings.
format	Article
fullrecord	<record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_3082705562</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3082705562</sourcerecordid><originalsourceid>FETCH-proquest_journals_30827055623</originalsourceid><addsrcrecordid>eNpjYuA0MjY21LUwMTLiYOAtLs4yMDAwMjM3MjU15mSwdc1L0Q3J1wVSCs45mXmZyYk5CiFFmUDSN7EkOSMzL12hPLMkQ8EnsSg9FUjmpZcmAhm--SmpOcU8DKxpiTnFqbxQmptB2c01xNlDt6Aov7A0tbgkPiu_tCgPKBVvbGBhZG5gampmZEycKgDVGDau</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3082705562</pqid></control><display><type>article</type><title>End-To-End Clinical Trial Matching with Large Language Models</title><source>Free E- Journals</source><creator>Ferber, Dyke ; Hilgers, Lars ; Wiest, Isabella C ; Marie-Elisabeth Leßmann ; Clusmann, Jan ; Neidlinger, Peter ; Zhu, Jiefu ; Wölflein, Georg ; Lammert, Jacqueline ; Tschochohei, Maximilian ; Böhme, Heiko ; Jäger, Dirk ; Aldea, Mihaela ; Truhn, Daniel ; Höper, Christiane ; Kather, Jakob Nikolas</creator><creatorcontrib>Ferber, Dyke ; Hilgers, Lars ; Wiest, Isabella C ; Marie-Elisabeth Leßmann ; Clusmann, Jan ; Neidlinger, Peter ; Zhu, Jiefu ; Wölflein, Georg ; Lammert, Jacqueline ; Tschochohei, Maximilian ; Böhme, Heiko ; Jäger, Dirk ; Aldea, Mihaela ; Truhn, Daniel ; Höper, Christiane ; Kather, Jakob Nikolas</creatorcontrib><description>Matching cancer patients to clinical trials is essential for advancing treatment and patient care. However, the inconsistent format of medical free text documents and complex trial eligibility criteria make this process extremely challenging and time-consuming for physicians. We investigated whether the entire trial matching process - from identifying relevant trials among 105,600 oncology-related clinical trials on clinicaltrials.gov to generating criterion-level eligibility matches - could be automated using Large Language Models (LLMs). 
Using GPT-4o and a set of 51 synthetic Electronic Health Records (EHRs), we demonstrate that our approach identifies relevant candidate trials in 93.3% of cases and achieves a preliminary accuracy of 88.0% when matching patient-level information at the criterion level against a baseline defined by human experts. Utilizing LLM feedback reveals that 39.3% criteria that were initially considered incorrect are either ambiguous or inaccurately annotated, leading to a total model accuracy of 92.7% after refining our human baseline. In summary, we present an end-to-end pipeline for clinical trial matching using LLMs, demonstrating high precision in screening and matching trials to individual patients, even outperforming the performance of qualified medical doctors. Our fully end-to-end pipeline can operate autonomously or with human supervision and is not restricted to oncology, offering a scalable solution for enhancing patient-trial matching in real-world settings.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Accuracy ; Clinical trials ; Criteria ; Electronic health records ; Large language models ; Matching ; Medical personnel ; Oncology ; Patients</subject><ispartof>arXiv.org, 2024-07</ispartof><rights>2024. This work is published under http://creativecommons.org/licenses/by-nc-sa/4.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>780,784</link.rule.ids></links><search><creatorcontrib>Ferber, Dyke</creatorcontrib><creatorcontrib>Hilgers, Lars</creatorcontrib><creatorcontrib>Wiest, Isabella C</creatorcontrib><creatorcontrib>Marie-Elisabeth Leßmann</creatorcontrib><creatorcontrib>Clusmann, Jan</creatorcontrib><creatorcontrib>Neidlinger, Peter</creatorcontrib><creatorcontrib>Zhu, Jiefu</creatorcontrib><creatorcontrib>Wölflein, Georg</creatorcontrib><creatorcontrib>Lammert, Jacqueline</creatorcontrib><creatorcontrib>Tschochohei, Maximilian</creatorcontrib><creatorcontrib>Böhme, Heiko</creatorcontrib><creatorcontrib>Jäger, Dirk</creatorcontrib><creatorcontrib>Aldea, Mihaela</creatorcontrib><creatorcontrib>Truhn, Daniel</creatorcontrib><creatorcontrib>Höper, Christiane</creatorcontrib><creatorcontrib>Kather, Jakob Nikolas</creatorcontrib><title>End-To-End Clinical Trial Matching with Large Language Models</title><title>arXiv.org</title><description>Matching cancer patients to clinical trials is essential for advancing treatment and patient care. However, the inconsistent format of medical free text documents and complex trial eligibility criteria make this process extremely challenging and time-consuming for physicians. We investigated whether the entire trial matching process - from identifying relevant trials among 105,600 oncology-related clinical trials on clinicaltrials.gov to generating criterion-level eligibility matches - could be automated using Large Language Models (LLMs). 
Using GPT-4o and a set of 51 synthetic Electronic Health Records (EHRs), we demonstrate that our approach identifies relevant candidate trials in 93.3% of cases and achieves a preliminary accuracy of 88.0% when matching patient-level information at the criterion level against a baseline defined by human experts. Utilizing LLM feedback reveals that 39.3% criteria that were initially considered incorrect are either ambiguous or inaccurately annotated, leading to a total model accuracy of 92.7% after refining our human baseline. In summary, we present an end-to-end pipeline for clinical trial matching using LLMs, demonstrating high precision in screening and matching trials to individual patients, even outperforming the performance of qualified medical doctors. Our fully end-to-end pipeline can operate autonomously or with human supervision and is not restricted to oncology, offering a scalable solution for enhancing patient-trial matching in real-world settings.</description><subject>Accuracy</subject><subject>Clinical trials</subject><subject>Criteria</subject><subject>Electronic health records</subject><subject>Large language models</subject><subject>Matching</subject><subject>Medical personnel</subject><subject>Oncology</subject><subject>Patients</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNpjYuA0MjY21LUwMTLiYOAtLs4yMDAwMjM3MjU15mSwdc1L0Q3J1wVSCs45mXmZyYk5CiFFmUDSN7EkOSMzL12hPLMkQ8EnsSg9FUjmpZcmAhm--SmpOcU8DKxpiTnFqbxQmptB2c01xNlDt6Aov7A0tbgkPiu_tCgPKBVvbGBhZG5gampmZEycKgDVGDau</recordid><startdate>20240718</startdate><enddate>20240718</enddate><creator>Ferber, Dyke</creator><creator>Hilgers, Lars</creator><creator>Wiest, Isabella C</creator><creator>Marie-Elisabeth 
Leßmann</creator><creator>Clusmann, Jan</creator><creator>Neidlinger, Peter</creator><creator>Zhu, Jiefu</creator><creator>Wölflein, Georg</creator><creator>Lammert, Jacqueline</creator><creator>Tschochohei, Maximilian</creator><creator>Böhme, Heiko</creator><creator>Jäger, Dirk</creator><creator>Aldea, Mihaela</creator><creator>Truhn, Daniel</creator><creator>Höper, Christiane</creator><creator>Kather, Jakob Nikolas</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20240718</creationdate><title>End-To-End Clinical Trial Matching with Large Language Models</title><author>Ferber, Dyke ; Hilgers, Lars ; Wiest, Isabella C ; Marie-Elisabeth Leßmann ; Clusmann, Jan ; Neidlinger, Peter ; Zhu, Jiefu ; Wölflein, Georg ; Lammert, Jacqueline ; Tschochohei, Maximilian ; Böhme, Heiko ; Jäger, Dirk ; Aldea, Mihaela ; Truhn, Daniel ; Höper, Christiane ; Kather, Jakob Nikolas</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_30827055623</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Accuracy</topic><topic>Clinical trials</topic><topic>Criteria</topic><topic>Electronic health records</topic><topic>Large language models</topic><topic>Matching</topic><topic>Medical personnel</topic><topic>Oncology</topic><topic>Patients</topic><toplevel>online_resources</toplevel><creatorcontrib>Ferber, Dyke</creatorcontrib><creatorcontrib>Hilgers, Lars</creatorcontrib><creatorcontrib>Wiest, Isabella C</creatorcontrib><creatorcontrib>Marie-Elisabeth 
Leßmann</creatorcontrib><creatorcontrib>Clusmann, Jan</creatorcontrib><creatorcontrib>Neidlinger, Peter</creatorcontrib><creatorcontrib>Zhu, Jiefu</creatorcontrib><creatorcontrib>Wölflein, Georg</creatorcontrib><creatorcontrib>Lammert, Jacqueline</creatorcontrib><creatorcontrib>Tschochohei, Maximilian</creatorcontrib><creatorcontrib>Böhme, Heiko</creatorcontrib><creatorcontrib>Jäger, Dirk</creatorcontrib><creatorcontrib>Aldea, Mihaela</creatorcontrib><creatorcontrib>Truhn, Daniel</creatorcontrib><creatorcontrib>Höper, Christiane</creatorcontrib><creatorcontrib>Kather, Jakob Nikolas</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>ProQuest - Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Ferber, Dyke</au><au>Hilgers, Lars</au><au>Wiest, Isabella C</au><au>Marie-Elisabeth Leßmann</au><au>Clusmann, Jan</au><au>Neidlinger, Peter</au><au>Zhu, Jiefu</au><au>Wölflein, Georg</au><au>Lammert, Jacqueline</au><au>Tschochohei, Maximilian</au><au>Böhme, 
Heiko</au><au>Jäger, Dirk</au><au>Aldea, Mihaela</au><au>Truhn, Daniel</au><au>Höper, Christiane</au><au>Kather, Jakob Nikolas</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>End-To-End Clinical Trial Matching with Large Language Models</atitle><jtitle>arXiv.org</jtitle><date>2024-07-18</date><risdate>2024</risdate><eissn>2331-8422</eissn><abstract>Matching cancer patients to clinical trials is essential for advancing treatment and patient care. However, the inconsistent format of medical free text documents and complex trial eligibility criteria make this process extremely challenging and time-consuming for physicians. We investigated whether the entire trial matching process - from identifying relevant trials among 105,600 oncology-related clinical trials on clinicaltrials.gov to generating criterion-level eligibility matches - could be automated using Large Language Models (LLMs). Using GPT-4o and a set of 51 synthetic Electronic Health Records (EHRs), we demonstrate that our approach identifies relevant candidate trials in 93.3% of cases and achieves a preliminary accuracy of 88.0% when matching patient-level information at the criterion level against a baseline defined by human experts. Utilizing LLM feedback reveals that 39.3% criteria that were initially considered incorrect are either ambiguous or inaccurately annotated, leading to a total model accuracy of 92.7% after refining our human baseline. In summary, we present an end-to-end pipeline for clinical trial matching using LLMs, demonstrating high precision in screening and matching trials to individual patients, even outperforming the performance of qualified medical doctors. 
Our fully end-to-end pipeline can operate autonomously or with human supervision and is not restricted to oncology, offering a scalable solution for enhancing patient-trial matching in real-world settings.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	EISSN: 2331-8422
ispartof	arXiv.org, 2024-07
issn	2331-8422
language	eng
recordid	cdi_proquest_journals_3082705562
source	Free E- Journals
subjects	Accuracy; Clinical trials; Criteria; Electronic health records; Large language models; Matching; Medical personnel; Oncology; Patients
title	End-To-End Clinical Trial Matching with Large Language Models
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T08%3A02%3A53IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=End-To-End%20Clinical%20Trial%20Matching%20with%20Large%20Language%20Models&rft.jtitle=arXiv.org&rft.au=Ferber,%20Dyke&rft.date=2024-07-18&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E3082705562%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3082705562&rft_id=info:pmid/&rfr_iscdi=true
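
The accuracy figures in the description field can be checked against each other with a short calculation. The sketch below is only one reading that happens to reproduce the reported numbers: it assumes that the 39.3% refers to the share of criteria initially scored as incorrect (the 12.0% remaining after the 88.0% preliminary accuracy) and that all of those are counted as correct once the human baseline is refined. The record does not state the exact counting procedure, and the function name `refined_accuracy` is hypothetical.

```python
def refined_accuracy(initial_accuracy: float, reclassified_share: float) -> float:
    """Return accuracy after a share of the initial errors is re-scored as correct.

    Assumption (not stated in the record): every reclassified criterion
    moves from "incorrect" to "correct" in the refined baseline.
    """
    initial_errors = 1.0 - initial_accuracy          # 12.0% of criteria
    recovered = initial_errors * reclassified_share  # share of all criteria re-scored
    return initial_accuracy + recovered


# Figures taken from the abstract: 88.0% preliminary accuracy, 39.3% reclassified.
acc = refined_accuracy(0.880, 0.393)
print(f"{acc:.1%}")  # prints "92.7%"
```

Under these assumptions, 0.880 + 0.120 × 0.393 = 0.92716, which rounds to the 92.7% total accuracy given in the abstract.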