Multi-Modal Dataset Creation for Federated Learning with DICOM Structured Reports
Purpose: Federated training is often hindered by heterogeneous datasets due to divergent data storage options, inconsistent naming schemes, varied annotation procedures, and disparities in label quality. This is particularly evident in the emerging multi-modal learning paradigms, where dataset harmo...
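Published in: | arXiv.org 2024-08 |
---|---|
Main authors: | Tölle, Malte; Burger, Lukas; Kelm, Halvar; André, Florian; Bannas, Peter; Diller, Gerhard; Frey, Norbert; Garthe, Philipp; Groß, Stefan; Hennemuth, Anja; Kaderali, Lars; Krüger, Nina; Leha, Andreas; Martin, Simon; Meyer, Alexander; Nagel, Eike; Orwat, Stefan; Scherer, Clemens; Seiffert, Moritz; Jan Moritz Seliger; Simm, Stefan; Friede, Tim; Seidler, Tim; Engelhardt, Sandy |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |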
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | arXiv.org |
container_volume | |
creator | Tölle, Malte; Burger, Lukas; Kelm, Halvar; André, Florian; Bannas, Peter; Diller, Gerhard; Frey, Norbert; Garthe, Philipp; Groß, Stefan; Hennemuth, Anja; Kaderali, Lars; Krüger, Nina; Leha, Andreas; Martin, Simon; Meyer, Alexander; Nagel, Eike; Orwat, Stefan; Scherer, Clemens; Seiffert, Moritz; Jan Moritz Seliger; Simm, Stefan; Friede, Tim; Seidler, Tim; Engelhardt, Sandy |
description | Purpose: Federated training is often hindered by heterogeneous datasets, which result from divergent data storage options, inconsistent naming schemes, varied annotation procedures, and disparities in label quality. This is particularly evident in emerging multi-modal learning paradigms, where dataset harmonization, including a uniform data representation and filtering options, is of paramount importance. Methods: DICOM structured reports enable the standardized linkage of arbitrary information beyond the imaging domain and can be used within Python deep learning pipelines with highdicom. Building on this, we developed an open platform for data integration and interactive filtering that simplifies the process of assembling multi-modal datasets. Results: In this study, we extend our prior work by showing its applicability to more, and more divergent, data types, and by streamlining datasets for federated training within an established consortium of eight university hospitals in Germany. We demonstrate its concurrent filtering ability by creating harmonized multi-modal datasets across all locations for predicting the outcome after minimally invasive heart valve replacement. The data include DICOM data (computed tomography images and electrocardiography scans), annotations (calcification segmentations, point sets, and pacemaker dependency), and metadata (prosthesis and diagnoses). Conclusion: Structured reports bridge the traditional gap between imaging systems and information systems. Using the inherent DICOM reference system, arbitrary data types can be queried concurrently to create meaningful cohorts for clinical studies. The graphical interface as well as example structured report templates will be made publicly available. |
format | Article |
publisher | Ithaca: Cornell University Library, arXiv.org |
date | 2024-08-06 |
rights | 2024. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
oa | free_for_read |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2024-08 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_3080874977 |
source | Free E- Journals |
subjects | Annotations; Calcification; Computed tomography; Data integration; Data storage; Datasets; Deep learning; Electrocardiography; Filtration; Heart valves; Information systems; Medical imaging; Prostheses; Reference systems |
title | Multi-Modal Dataset Creation for Federated Learning with DICOM Structured Reports |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-05T21%3A29%3A47IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Multi-Modal%20Dataset%20Creation%20for%20Federated%20Learning%20with%20DICOM%20Structured%20Reports&rft.jtitle=arXiv.org&rft.au=T%C3%B6lle,%20Malte&rft.date=2024-08-06&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E3080874977%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3080874977&rft_id=info:pmid/&rfr_iscdi=true |
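
The description above states that arbitrary data types can be queried concurrently through the DICOM reference system to build cohorts. The following is a minimal sketch of that idea in plain Python, not the authors' platform and not the highdicom API: the `Item` class, the `build_cohort` function, the `pacemaker_dependency` key, and all UIDs are hypothetical names chosen for illustration. It assumes each stored object carries a patient ID, a type, its own UID, and the UIDs of the objects it references.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """Hypothetical stand-in for one stored DICOM object."""
    patient_id: str
    kind: str                      # e.g. "CT", "ECG", or "SR" (structured report)
    sop_instance_uid: str
    referenced_uids: tuple = ()    # UIDs of the objects this item points to
    content: dict = field(default_factory=dict)  # simplified SR content


def build_cohort(items, required_kinds, sr_filter):
    """Return sorted patient IDs that have every required kind of object
    and at least one SR that passes sr_filter and whose references all
    resolve to objects of the same patient."""
    by_uid = {it.sop_instance_uid: it for it in items}
    kinds_per_patient = {}
    matched = set()
    for it in items:
        kinds_per_patient.setdefault(it.patient_id, set()).add(it.kind)
        if it.kind == "SR" and sr_filter(it.content):
            targets = [by_uid.get(uid) for uid in it.referenced_uids]
            if all(t is not None and t.patient_id == it.patient_id for t in targets):
                matched.add(it.patient_id)
    return sorted(p for p in matched if required_kinds <= kinds_per_patient[p])


# Illustrative data only; the UIDs and values are made up.
items = [
    Item("P1", "CT", "1.1"),
    Item("P1", "ECG", "1.2"),
    Item("P1", "SR", "1.3", ("1.1",), {"pacemaker_dependency": True}),
    Item("P2", "CT", "2.1"),
    Item("P2", "SR", "2.2", ("2.1",), {"pacemaker_dependency": True}),   # no ECG
    Item("P3", "CT", "3.1"),
    Item("P3", "ECG", "3.2"),
    Item("P3", "SR", "3.3", ("3.1",), {"pacemaker_dependency": False}),
]


def is_dependent(content):
    return content.get("pacemaker_dependency") is True


cohort = build_cohort(items, {"CT", "ECG", "SR"}, is_dependent)
print(cohort)
```

Because the filter on report content and the check for required data types run in the same pass, a patient enters the cohort only when both conditions hold; relaxing `required_kinds` to `{"CT", "SR"}` would also admit the patient without an ECG.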