Multi-Modal Dataset Creation for Federated Learning with DICOM Structured Reports

Purpose: Federated training is often hindered by heterogeneous datasets due to divergent data storage options, inconsistent naming schemes, varied annotation procedures, and disparities in label quality. This is particularly evident in the emerging multi-modal learning paradigms, where dataset harmo...

Detailed description

Bibliographic details
Published in: arXiv.org 2024-08
Main authors: Tölle, Malte, Burger, Lukas, Kelm, Halvar, André, Florian, Bannas, Peter, Diller, Gerhard, Frey, Norbert, Garthe, Philipp, Groß, Stefan, Hennemuth, Anja, Kaderali, Lars, Krüger, Nina, Leha, Andreas, Martin, Simon, Meyer, Alexander, Nagel, Eike, Orwat, Stefan, Scherer, Clemens, Seiffert, Moritz, Seliger, Jan Moritz, Simm, Stefan, Friede, Tim, Seidler, Tim, Engelhardt, Sandy
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue
container_start_page
container_title arXiv.org
container_volume
creator Tölle, Malte
Burger, Lukas
Kelm, Halvar
André, Florian
Bannas, Peter
Diller, Gerhard
Frey, Norbert
Garthe, Philipp
Groß, Stefan
Hennemuth, Anja
Kaderali, Lars
Krüger, Nina
Leha, Andreas
Martin, Simon
Meyer, Alexander
Nagel, Eike
Orwat, Stefan
Scherer, Clemens
Seiffert, Moritz
Seliger, Jan Moritz
Simm, Stefan
Friede, Tim
Seidler, Tim
Engelhardt, Sandy
description Purpose: Federated training is often hindered by heterogeneous datasets, which arise from divergent data storage options, inconsistent naming schemes, varied annotation procedures, and disparities in label quality. This is particularly evident in emerging multi-modal learning paradigms, where dataset harmonization, including a uniform data representation and filtering options, is of paramount importance. Methods: DICOM structured reports enable the standardized linkage of arbitrary information beyond the imaging domain and can be used within Python deep learning pipelines with highdicom. Building on this, we developed an open platform for data integration with interactive filtering capabilities that simplifies the process of assembling multi-modal datasets. Results: In this study, we extend our prior work by showing its applicability to more and divergent data types, and by streamlining datasets for federated training within an established consortium of eight university hospitals in Germany. We demonstrate its concurrent filtering ability by creating harmonized multi-modal datasets across all locations for predicting the outcome after minimally invasive heart valve replacement. The data include DICOM data (i.e., computed tomography images, electrocardiography scans), annotations (i.e., calcification segmentations, point sets, and pacemaker dependency), and metadata (i.e., prosthesis and diagnoses). Conclusion: Structured reports bridge the traditional gap between imaging systems and information systems. Using the inherent DICOM reference system, arbitrary data types can be queried concurrently to create meaningful cohorts for clinical studies. The graphical interface as well as example structured report templates will be made publicly available.
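The concurrent filtering described in the abstract can be illustrated with a minimal sketch. It is not the authors' platform and does not use highdicom; it only assumes that every object carries a unique identifier and that a structured report lists the identifiers of the objects it refers to. The `Record` class, the `select_cohort` function, and the `pacemaker_dependency` key are hypothetical names chosen for this illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable, List, Set, Tuple


@dataclass
class Record:
    """Hypothetical stand-in for one stored object (image, ECG, or report)."""
    uid: str                                  # unique identifier of this object
    patient_id: str
    kind: str                                 # e.g. "CT", "ECG", "SR"
    referenced_uids: Tuple[str, ...] = ()     # objects a report points to
    content: Dict[str, Any] = field(default_factory=dict)


def select_cohort(
    records: Iterable[Record],
    required_kinds: Set[str],
    sr_filter: Callable[[Dict[str, Any]], bool],
) -> List[str]:
    """Return patients that have all required data kinds and at least one
    report passing sr_filter whose references all resolve to stored objects."""
    records = list(records)
    known_uids = {r.uid for r in records}
    kinds_per_patient: Dict[str, Set[str]] = defaultdict(set)
    matched: Set[str] = set()
    for r in records:
        kinds_per_patient[r.patient_id].add(r.kind)
        if r.kind == "SR" and sr_filter(r.content):
            # Follow the report's references; skip reports with dangling links.
            if all(uid in known_uids for uid in r.referenced_uids):
                matched.add(r.patient_id)
    return sorted(p for p in matched if required_kinds <= kinds_per_patient[p])


# Toy data: only patient A has CT and ECG plus a valid, matching report.
records = [
    Record("1", "A", "CT"), Record("2", "A", "ECG"),
    Record("3", "A", "SR", ("1",), {"pacemaker_dependency": True}),
    Record("4", "B", "CT"),                   # B has no ECG
    Record("5", "B", "SR", ("4",), {"pacemaker_dependency": True}),
    Record("6", "C", "CT"), Record("7", "C", "ECG"),
    Record("8", "C", "SR", ("6",), {"pacemaker_dependency": False}),
    Record("9", "D", "CT"), Record("10", "D", "ECG"),
    Record("11", "D", "SR", ("99",), {"pacemaker_dependency": True}),  # dangling
]
is_dependent = lambda content: content.get("pacemaker_dependency") is True
cohort = select_cohort(records, {"CT", "ECG"}, is_dependent)
print(cohort)
```

In this toy setup the modality requirement and the report criterion are evaluated in one pass, which mirrors the idea of querying several data types at once through shared references; a real pipeline would read these fields from DICOM objects instead of in-memory records.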
format Article
fullrecord <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_3080874977</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3080874977</sourcerecordid><originalsourceid>FETCH-proquest_journals_30808749773</originalsourceid><addsrcrecordid>eNqNys0KgkAUQOEhCJLyHQZaC9OMNrbWpCCJfvYy5LUUcezOHXr9XPQArc7iOzMWSKU2URpLuWChc50QQm61TBIVsEvpe2qj0tam57kh44B4hmCotQNvLPICakBDUPMTGBza4ck_Lb14fszOJb8R-gd5nPgKo0VyKzZvTO8g_HXJ1sX-nh2iEe3bg6Oqsx6HiSolUpHqeKe1-u_6AkycPsc</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3080874977</pqid></control><display><type>article</type><title>Multi-Modal Dataset Creation for Federated Learning with DICOM Structured Reports</title><source>Free E- Journals</source><creator>Tölle, Malte ; Burger, Lukas ; Kelm, Halvar ; André, Florian ; Bannas, Peter ; Diller, Gerhard ; Frey, Norbert ; Garthe, Philipp ; Groß, Stefan ; Hennemuth, Anja ; Kaderali, Lars ; Krüger, Nina ; Leha, Andreas ; Martin, Simon ; Meyer, Alexander ; Nagel, Eike ; Orwat, Stefan ; Scherer, Clemens ; Seiffert, Moritz ; Jan Moritz Seliger ; Simm, Stefan ; Friede, Tim ; Seidler, Tim ; Engelhardt, Sandy</creator><creatorcontrib>Tölle, Malte ; Burger, Lukas ; Kelm, Halvar ; André, Florian ; Bannas, Peter ; Diller, Gerhard ; Frey, Norbert ; Garthe, Philipp ; Groß, Stefan ; Hennemuth, Anja ; Kaderali, Lars ; Krüger, Nina ; Leha, Andreas ; Martin, Simon ; Meyer, Alexander ; Nagel, Eike ; Orwat, Stefan ; Scherer, Clemens ; Seiffert, Moritz ; Jan Moritz Seliger ; Simm, Stefan ; Friede, Tim ; Seidler, Tim ; Engelhardt, Sandy</creatorcontrib><description>Purpose: Federated training is often hindered by heterogeneous datasets due to divergent data storage options, inconsistent naming schemes, varied annotation procedures, and disparities in label quality. 
This is particularly evident in the emerging multi-modal learning paradigms, where dataset harmonization including a uniform data representation and filtering options are of paramount importance. Methods: DICOM structured reports enable the standardized linkage of arbitrary information beyond the imaging domain and can be used within Python deep learning pipelines with highdicom. Building on this, we developed an open platform for data integration and interactive filtering capabilities that simplifies the process of assembling multi-modal datasets. Results: In this study, we extend our prior work by showing its applicability to more and divergent data types, as well as streamlining datasets for federated training within an established consortium of eight university hospitals in Germany. We prove its concurrent filtering ability by creating harmonized multi-modal datasets across all locations for predicting the outcome after minimally invasive heart valve replacement. The data includes DICOM data (i.e. computed tomography images, electrocardiography scans) as well as annotations (i.e. calcification segmentations, pointsets and pacemaker dependency), and metadata (i.e. prosthesis and diagnoses). Conclusion: Structured reports bridge the traditional gap between imaging systems and information systems. Utilizing the inherent DICOM reference system arbitrary data types can be queried concurrently to create meaningful cohorts for clinical studies. 
The graphical interface as well as example structured report templates will be made publicly available.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Annotations ; Calcification ; Computed tomography ; Data integration ; Data storage ; Datasets ; Deep learning ; Electrocardiography ; Filtration ; Heart valves ; Information systems ; Medical imaging ; Prostheses ; Reference systems</subject><ispartof>arXiv.org, 2024-08</ispartof><rights>2024. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>780,784</link.rule.ids></links><search><creatorcontrib>Tölle, Malte</creatorcontrib><creatorcontrib>Burger, Lukas</creatorcontrib><creatorcontrib>Kelm, Halvar</creatorcontrib><creatorcontrib>André, Florian</creatorcontrib><creatorcontrib>Bannas, Peter</creatorcontrib><creatorcontrib>Diller, Gerhard</creatorcontrib><creatorcontrib>Frey, Norbert</creatorcontrib><creatorcontrib>Garthe, Philipp</creatorcontrib><creatorcontrib>Groß, Stefan</creatorcontrib><creatorcontrib>Hennemuth, Anja</creatorcontrib><creatorcontrib>Kaderali, Lars</creatorcontrib><creatorcontrib>Krüger, Nina</creatorcontrib><creatorcontrib>Leha, Andreas</creatorcontrib><creatorcontrib>Martin, Simon</creatorcontrib><creatorcontrib>Meyer, Alexander</creatorcontrib><creatorcontrib>Nagel, Eike</creatorcontrib><creatorcontrib>Orwat, Stefan</creatorcontrib><creatorcontrib>Scherer, Clemens</creatorcontrib><creatorcontrib>Seiffert, Moritz</creatorcontrib><creatorcontrib>Jan Moritz 
Seliger</creatorcontrib><creatorcontrib>Simm, Stefan</creatorcontrib><creatorcontrib>Friede, Tim</creatorcontrib><creatorcontrib>Seidler, Tim</creatorcontrib><creatorcontrib>Engelhardt, Sandy</creatorcontrib><title>Multi-Modal Dataset Creation for Federated Learning with DICOM Structured Reports</title><title>arXiv.org</title><description>Purpose: Federated training is often hindered by heterogeneous datasets due to divergent data storage options, inconsistent naming schemes, varied annotation procedures, and disparities in label quality. This is particularly evident in the emerging multi-modal learning paradigms, where dataset harmonization including a uniform data representation and filtering options are of paramount importance. Methods: DICOM structured reports enable the standardized linkage of arbitrary information beyond the imaging domain and can be used within Python deep learning pipelines with highdicom. Building on this, we developed an open platform for data integration and interactive filtering capabilities that simplifies the process of assembling multi-modal datasets. Results: In this study, we extend our prior work by showing its applicability to more and divergent data types, as well as streamlining datasets for federated training within an established consortium of eight university hospitals in Germany. We prove its concurrent filtering ability by creating harmonized multi-modal datasets across all locations for predicting the outcome after minimally invasive heart valve replacement. The data includes DICOM data (i.e. computed tomography images, electrocardiography scans) as well as annotations (i.e. calcification segmentations, pointsets and pacemaker dependency), and metadata (i.e. prosthesis and diagnoses). Conclusion: Structured reports bridge the traditional gap between imaging systems and information systems. 
Utilizing the inherent DICOM reference system arbitrary data types can be queried concurrently to create meaningful cohorts for clinical studies. The graphical interface as well as example structured report templates will be made publicly available.</description><subject>Annotations</subject><subject>Calcification</subject><subject>Computed tomography</subject><subject>Data integration</subject><subject>Data storage</subject><subject>Datasets</subject><subject>Deep learning</subject><subject>Electrocardiography</subject><subject>Filtration</subject><subject>Heart valves</subject><subject>Information systems</subject><subject>Medical imaging</subject><subject>Prostheses</subject><subject>Reference systems</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNqNys0KgkAUQOEhCJLyHQZaC9OMNrbWpCCJfvYy5LUUcezOHXr9XPQArc7iOzMWSKU2URpLuWChc50QQm61TBIVsEvpe2qj0tam57kh44B4hmCotQNvLPICakBDUPMTGBza4ck_Lb14fszOJb8R-gd5nPgKo0VyKzZvTO8g_HXJ1sX-nh2iEe3bg6Oqsx6HiSolUpHqeKe1-u_6AkycPsc</recordid><startdate>20240806</startdate><enddate>20240806</enddate><creator>Tölle, Malte</creator><creator>Burger, Lukas</creator><creator>Kelm, Halvar</creator><creator>André, Florian</creator><creator>Bannas, Peter</creator><creator>Diller, Gerhard</creator><creator>Frey, Norbert</creator><creator>Garthe, Philipp</creator><creator>Groß, Stefan</creator><creator>Hennemuth, Anja</creator><creator>Kaderali, Lars</creator><creator>Krüger, Nina</creator><creator>Leha, Andreas</creator><creator>Martin, Simon</creator><creator>Meyer, Alexander</creator><creator>Nagel, Eike</creator><creator>Orwat, Stefan</creator><creator>Scherer, Clemens</creator><creator>Seiffert, Moritz</creator><creator>Jan Moritz Seliger</creator><creator>Simm, 
Stefan</creator><creator>Friede, Tim</creator><creator>Seidler, Tim</creator><creator>Engelhardt, Sandy</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20240806</creationdate><title>Multi-Modal Dataset Creation for Federated Learning with DICOM Structured Reports</title><author>Tölle, Malte ; Burger, Lukas ; Kelm, Halvar ; André, Florian ; Bannas, Peter ; Diller, Gerhard ; Frey, Norbert ; Garthe, Philipp ; Groß, Stefan ; Hennemuth, Anja ; Kaderali, Lars ; Krüger, Nina ; Leha, Andreas ; Martin, Simon ; Meyer, Alexander ; Nagel, Eike ; Orwat, Stefan ; Scherer, Clemens ; Seiffert, Moritz ; Jan Moritz Seliger ; Simm, Stefan ; Friede, Tim ; Seidler, Tim ; Engelhardt, Sandy</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_30808749773</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Annotations</topic><topic>Calcification</topic><topic>Computed tomography</topic><topic>Data integration</topic><topic>Data storage</topic><topic>Datasets</topic><topic>Deep learning</topic><topic>Electrocardiography</topic><topic>Filtration</topic><topic>Heart valves</topic><topic>Information systems</topic><topic>Medical imaging</topic><topic>Prostheses</topic><topic>Reference systems</topic><toplevel>online_resources</toplevel><creatorcontrib>Tölle, Malte</creatorcontrib><creatorcontrib>Burger, Lukas</creatorcontrib><creatorcontrib>Kelm, Halvar</creatorcontrib><creatorcontrib>André, Florian</creatorcontrib><creatorcontrib>Bannas, 
Peter</creatorcontrib><creatorcontrib>Diller, Gerhard</creatorcontrib><creatorcontrib>Frey, Norbert</creatorcontrib><creatorcontrib>Garthe, Philipp</creatorcontrib><creatorcontrib>Groß, Stefan</creatorcontrib><creatorcontrib>Hennemuth, Anja</creatorcontrib><creatorcontrib>Kaderali, Lars</creatorcontrib><creatorcontrib>Krüger, Nina</creatorcontrib><creatorcontrib>Leha, Andreas</creatorcontrib><creatorcontrib>Martin, Simon</creatorcontrib><creatorcontrib>Meyer, Alexander</creatorcontrib><creatorcontrib>Nagel, Eike</creatorcontrib><creatorcontrib>Orwat, Stefan</creatorcontrib><creatorcontrib>Scherer, Clemens</creatorcontrib><creatorcontrib>Seiffert, Moritz</creatorcontrib><creatorcontrib>Jan Moritz Seliger</creatorcontrib><creatorcontrib>Simm, Stefan</creatorcontrib><creatorcontrib>Friede, Tim</creatorcontrib><creatorcontrib>Seidler, Tim</creatorcontrib><creatorcontrib>Engelhardt, Sandy</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Tölle, Malte</au><au>Burger, Lukas</au><au>Kelm, Halvar</au><au>André, Florian</au><au>Bannas, Peter</au><au>Diller, Gerhard</au><au>Frey, Norbert</au><au>Garthe, Philipp</au><au>Groß, Stefan</au><au>Hennemuth, Anja</au><au>Kaderali, Lars</au><au>Krüger, Nina</au><au>Leha, Andreas</au><au>Martin, Simon</au><au>Meyer, Alexander</au><au>Nagel, Eike</au><au>Orwat, Stefan</au><au>Scherer, Clemens</au><au>Seiffert, Moritz</au><au>Jan Moritz Seliger</au><au>Simm, Stefan</au><au>Friede, Tim</au><au>Seidler, Tim</au><au>Engelhardt, Sandy</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>Multi-Modal Dataset Creation for Federated Learning with DICOM Structured Reports</atitle><jtitle>arXiv.org</jtitle><date>2024-08-06</date><risdate>2024</risdate><eissn>2331-8422</eissn><abstract>Purpose: Federated training is often hindered by heterogeneous datasets due to divergent data storage options, inconsistent naming schemes, varied annotation procedures, and disparities in label quality. This is particularly evident in the emerging multi-modal learning paradigms, where dataset harmonization including a uniform data representation and filtering options are of paramount importance. Methods: DICOM structured reports enable the standardized linkage of arbitrary information beyond the imaging domain and can be used within Python deep learning pipelines with highdicom. Building on this, we developed an open platform for data integration and interactive filtering capabilities that simplifies the process of assembling multi-modal datasets. Results: In this study, we extend our prior work by showing its applicability to more and divergent data types, as well as streamlining datasets for federated training within an established consortium of eight university hospitals in Germany. 
We prove its concurrent filtering ability by creating harmonized multi-modal datasets across all locations for predicting the outcome after minimally invasive heart valve replacement. The data includes DICOM data (i.e. computed tomography images, electrocardiography scans) as well as annotations (i.e. calcification segmentations, pointsets and pacemaker dependency), and metadata (i.e. prosthesis and diagnoses). Conclusion: Structured reports bridge the traditional gap between imaging systems and information systems. Utilizing the inherent DICOM reference system arbitrary data types can be queried concurrently to create meaningful cohorts for clinical studies. The graphical interface as well as example structured report templates will be made publicly available.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2024-08
issn 2331-8422
language eng
recordid cdi_proquest_journals_3080874977
source Free E-Journals
subjects Annotations
Calcification
Computed tomography
Data integration
Data storage
Datasets
Deep learning
Electrocardiography
Filtration
Heart valves
Information systems
Medical imaging
Prostheses
Reference systems
title Multi-Modal Dataset Creation for Federated Learning with DICOM Structured Reports
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-05T21%3A29%3A47IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Multi-Modal%20Dataset%20Creation%20for%20Federated%20Learning%20with%20DICOM%20Structured%20Reports&rft.jtitle=arXiv.org&rft.au=T%C3%B6lle,%20Malte&rft.date=2024-08-06&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E3080874977%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3080874977&rft_id=info:pmid/&rfr_iscdi=true