Towards Foundation Models for Critical Care Time Series

Notable progress has been made in generalist medical large language models across various healthcare areas. However, large-scale modeling of in-hospital time series data - such as vital signs, lab results, and treatments in critical care - remains underexplored. Existing datasets are relatively smal...

Detailed description

Saved in:
Bibliographic details
Main authors: Burger, Manuel, Sergeev, Fedor, Londschien, Malte, Chopard, Daphné, Yèche, Hugo, Gerdes, Eike, Leshetkina, Polina, Morgenroth, Alexander, Babür, Zeynep, Bogojeska, Jasmina, Faltys, Martin, Kuznetsova, Rita, Rätsch, Gunnar
Format: Article
Language: eng
Subjects:
Online access: Order full text
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Burger, Manuel
Sergeev, Fedor
Londschien, Malte
Chopard, Daphné
Yèche, Hugo
Gerdes, Eike
Leshetkina, Polina
Morgenroth, Alexander
Babür, Zeynep
Bogojeska, Jasmina
Faltys, Martin
Kuznetsova, Rita
Rätsch, Gunnar
description Notable progress has been made in generalist medical large language models across various healthcare areas. However, large-scale modeling of in-hospital time series data - such as vital signs, lab results, and treatments in critical care - remains underexplored. Existing datasets are relatively small, but combining them can enhance patient diversity and improve model robustness. To effectively utilize these combined datasets for large-scale modeling, it is essential to address the distribution shifts caused by varying treatment policies, necessitating the harmonization of treatment variables across the different datasets. This work aims to establish a foundation for training large-scale multi-variate time series models on critical care data and to provide a benchmark for machine learning models in transfer learning across hospitals to study and address distribution shift challenges. We introduce a harmonized dataset for sequence modeling and transfer learning research, representing the first large-scale collection to include core treatment variables. Future plans involve expanding this dataset to support further advancements in transfer learning and the development of scalable, generalizable models for critical healthcare applications.
doi_str_mv 10.48550/arxiv.2411.16346
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2411_16346</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2411_16346</sourcerecordid><originalsourceid>FETCH-arxiv_primary_2411_163463</originalsourceid><addsrcrecordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMjE01DM0MzYx42QwD8kvTyxKKVZwyy_NS0ksyczPU_DNT0nNKVZIyy9ScC7KLMlMTsxRcE4sSlUIycxNVQhOLcpMLeZhYE1LzClO5YXS3Azybq4hzh66YCviC4oycxOLKuNBVsWDrTImrAIArsAy5Q</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Towards Foundation Models for Critical Care Time Series</title><source>arXiv.org</source><creator>Burger, Manuel ; Sergeev, Fedor ; Londschien, Malte ; Chopard, Daphné ; Yèche, Hugo ; Gerdes, Eike ; Leshetkina, Polina ; Morgenroth, Alexander ; Babür, Zeynep ; Bogojeska, Jasmina ; Faltys, Martin ; Kuznetsova, Rita ; Rätsch, Gunnar</creator><creatorcontrib>Burger, Manuel ; Sergeev, Fedor ; Londschien, Malte ; Chopard, Daphné ; Yèche, Hugo ; Gerdes, Eike ; Leshetkina, Polina ; Morgenroth, Alexander ; Babür, Zeynep ; Bogojeska, Jasmina ; Faltys, Martin ; Kuznetsova, Rita ; Rätsch, Gunnar</creatorcontrib><description>Notable progress has been made in generalist medical large language models across various healthcare areas. However, large-scale modeling of in-hospital time series data - such as vital signs, lab results, and treatments in critical care - remains underexplored. Existing datasets are relatively small, but combining them can enhance patient diversity and improve model robustness. To effectively utilize these combined datasets for large-scale modeling, it is essential to address the distribution shifts caused by varying treatment policies, necessitating the harmonization of treatment variables across the different datasets. 
This work aims to establish a foundation for training large-scale multi-variate time series models on critical care data and to provide a benchmark for machine learning models in transfer learning across hospitals to study and address distribution shift challenges. We introduce a harmonized dataset for sequence modeling and transfer learning research, representing the first large-scale collection to include core treatment variables. Future plans involve expanding this dataset to support further advancements in transfer learning and the development of scalable, generalizable models for critical healthcare applications.</description><identifier>DOI: 10.48550/arxiv.2411.16346</identifier><language>eng</language><subject>Computer Science - Learning ; Statistics - Machine Learning</subject><creationdate>2024-11</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2411.16346$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2411.16346$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Burger, Manuel</creatorcontrib><creatorcontrib>Sergeev, Fedor</creatorcontrib><creatorcontrib>Londschien, Malte</creatorcontrib><creatorcontrib>Chopard, Daphné</creatorcontrib><creatorcontrib>Yèche, Hugo</creatorcontrib><creatorcontrib>Gerdes, Eike</creatorcontrib><creatorcontrib>Leshetkina, Polina</creatorcontrib><creatorcontrib>Morgenroth, Alexander</creatorcontrib><creatorcontrib>Babür, Zeynep</creatorcontrib><creatorcontrib>Bogojeska, Jasmina</creatorcontrib><creatorcontrib>Faltys, 
Martin</creatorcontrib><creatorcontrib>Kuznetsova, Rita</creatorcontrib><creatorcontrib>Rätsch, Gunnar</creatorcontrib><title>Towards Foundation Models for Critical Care Time Series</title><description>Notable progress has been made in generalist medical large language models across various healthcare areas. However, large-scale modeling of in-hospital time series data - such as vital signs, lab results, and treatments in critical care - remains underexplored. Existing datasets are relatively small, but combining them can enhance patient diversity and improve model robustness. To effectively utilize these combined datasets for large-scale modeling, it is essential to address the distribution shifts caused by varying treatment policies, necessitating the harmonization of treatment variables across the different datasets. This work aims to establish a foundation for training large-scale multi-variate time series models on critical care data and to provide a benchmark for machine learning models in transfer learning across hospitals to study and address distribution shift challenges. We introduce a harmonized dataset for sequence modeling and transfer learning research, representing the first large-scale collection to include core treatment variables. 
Future plans involve expanding this dataset to support further advancements in transfer learning and the development of scalable, generalizable models for critical healthcare applications.</description><subject>Computer Science - Learning</subject><subject>Statistics - Machine Learning</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMjE01DM0MzYx42QwD8kvTyxKKVZwyy_NS0ksyczPU_DNT0nNKVZIyy9ScC7KLMlMTsxRcE4sSlUIycxNVQhOLcpMLeZhYE1LzClO5YXS3Azybq4hzh66YCviC4oycxOLKuNBVsWDrTImrAIArsAy5Q</recordid><startdate>20241125</startdate><enddate>20241125</enddate><creator>Burger, Manuel</creator><creator>Sergeev, Fedor</creator><creator>Londschien, Malte</creator><creator>Chopard, Daphné</creator><creator>Yèche, Hugo</creator><creator>Gerdes, Eike</creator><creator>Leshetkina, Polina</creator><creator>Morgenroth, Alexander</creator><creator>Babür, Zeynep</creator><creator>Bogojeska, Jasmina</creator><creator>Faltys, Martin</creator><creator>Kuznetsova, Rita</creator><creator>Rätsch, Gunnar</creator><scope>AKY</scope><scope>EPD</scope><scope>GOX</scope></search><sort><creationdate>20241125</creationdate><title>Towards Foundation Models for Critical Care Time Series</title><author>Burger, Manuel ; Sergeev, Fedor ; Londschien, Malte ; Chopard, Daphné ; Yèche, Hugo ; Gerdes, Eike ; Leshetkina, Polina ; Morgenroth, Alexander ; Babür, Zeynep ; Bogojeska, Jasmina ; Faltys, Martin ; Kuznetsova, Rita ; Rätsch, Gunnar</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-arxiv_primary_2411_163463</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Science - Learning</topic><topic>Statistics - Machine Learning</topic><toplevel>online_resources</toplevel><creatorcontrib>Burger, Manuel</creatorcontrib><creatorcontrib>Sergeev, 
Fedor</creatorcontrib><creatorcontrib>Londschien, Malte</creatorcontrib><creatorcontrib>Chopard, Daphné</creatorcontrib><creatorcontrib>Yèche, Hugo</creatorcontrib><creatorcontrib>Gerdes, Eike</creatorcontrib><creatorcontrib>Leshetkina, Polina</creatorcontrib><creatorcontrib>Morgenroth, Alexander</creatorcontrib><creatorcontrib>Babür, Zeynep</creatorcontrib><creatorcontrib>Bogojeska, Jasmina</creatorcontrib><creatorcontrib>Faltys, Martin</creatorcontrib><creatorcontrib>Kuznetsova, Rita</creatorcontrib><creatorcontrib>Rätsch, Gunnar</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv Statistics</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Burger, Manuel</au><au>Sergeev, Fedor</au><au>Londschien, Malte</au><au>Chopard, Daphné</au><au>Yèche, Hugo</au><au>Gerdes, Eike</au><au>Leshetkina, Polina</au><au>Morgenroth, Alexander</au><au>Babür, Zeynep</au><au>Bogojeska, Jasmina</au><au>Faltys, Martin</au><au>Kuznetsova, Rita</au><au>Rätsch, Gunnar</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Towards Foundation Models for Critical Care Time Series</atitle><date>2024-11-25</date><risdate>2024</risdate><abstract>Notable progress has been made in generalist medical large language models across various healthcare areas. However, large-scale modeling of in-hospital time series data - such as vital signs, lab results, and treatments in critical care - remains underexplored. Existing datasets are relatively small, but combining them can enhance patient diversity and improve model robustness. To effectively utilize these combined datasets for large-scale modeling, it is essential to address the distribution shifts caused by varying treatment policies, necessitating the harmonization of treatment variables across the different datasets. 
This work aims to establish a foundation for training large-scale multi-variate time series models on critical care data and to provide a benchmark for machine learning models in transfer learning across hospitals to study and address distribution shift challenges. We introduce a harmonized dataset for sequence modeling and transfer learning research, representing the first large-scale collection to include core treatment variables. Future plans involve expanding this dataset to support further advancements in transfer learning and the development of scalable, generalizable models for critical healthcare applications.</abstract><doi>10.48550/arxiv.2411.16346</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2411.16346
ispartof
issn
language eng
recordid cdi_arxiv_primary_2411_16346
source arXiv.org
subjects Computer Science - Learning
Statistics - Machine Learning
title Towards Foundation Models for Critical Care Time Series
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-10T17%3A40%3A11IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Towards%20Foundation%20Models%20for%20Critical%20Care%20Time%20Series&rft.au=Burger,%20Manuel&rft.date=2024-11-25&rft_id=info:doi/10.48550/arxiv.2411.16346&rft_dat=%3Carxiv_GOX%3E2411_16346%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true