Label-Efficient Data Augmentation with Video Diffusion Models for Guidewire Segmentation in Cardiac Fluoroscopy

The accurate segmentation of guidewires in interventional cardiac fluoroscopy videos is crucial for computer-aided navigation tasks. Although deep learning methods have demonstrated high accuracy and robustness in wire segmentation, they require substantial annotated datasets for generalizability, underscoring the need for extensive labeled data to enhance model performance. To address this challenge, we propose the Segmentation-guided Frame-consistency Video Diffusion Model (SF-VD) to generate large collections of labeled fluoroscopy videos, augmenting the training data for wire segmentation networks. SF-VD leverages videos with limited annotations by independently modeling scene distribution and motion distribution. It first samples the scene distribution by generating 2D fluoroscopy images with wires positioned according to a specified input mask, and then samples the motion distribution by progressively generating subsequent frames, ensuring frame-to-frame coherence through a frame-consistency strategy. A segmentation-guided mechanism further refines the process by adjusting wire contrast, ensuring a diverse range of visibility in the synthesized image. Evaluation on a fluoroscopy dataset confirms the superior quality of the generated videos and shows significant improvements in guidewire segmentation.
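The abstract describes a two-stage sampling pipeline: first sample the scene distribution (a 2D fluoroscopy frame whose wire follows a given segmentation mask), then sample the motion distribution by generating each subsequent frame conditioned on the previous one. The Python sketch below illustrates only that control flow, using untrained placeholder networks and a toy DDPM-style reverse loop; the class names, the 64x64 resolution, and the omission of the segmentation-guided contrast adjustment are illustrative assumptions, not the authors' SF-VD implementation.

import torch
import torch.nn as nn

class TinyDenoiser(nn.Module):
    """Placeholder denoiser: predicts noise from a noisy frame plus one conditioning image."""
    def __init__(self, cond_channels: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1 + cond_channels, 16, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),
        )

    def forward(self, x_t, cond):
        return self.net(torch.cat([x_t, cond], dim=1))

@torch.no_grad()
def ddpm_sample(denoiser, cond, steps=50, shape=(1, 1, 64, 64)):
    """Toy DDPM-style reverse loop; a real sampler would use a learned noise schedule."""
    betas = torch.linspace(1e-4, 2e-2, steps)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)
    x = torch.randn(shape)
    for t in reversed(range(steps)):
        eps = denoiser(x, cond)
        # Posterior mean of x_{t-1} given the predicted noise, then add noise except at t = 0.
        x = (x - betas[t] / torch.sqrt(1.0 - alpha_bars[t]) * eps) / torch.sqrt(alphas[t])
        if t > 0:
            x = x + torch.sqrt(betas[t]) * torch.randn_like(x)
    return x

# Stage 1: sample the scene distribution -- a first frame whose wire follows the input mask.
scene_model = TinyDenoiser(cond_channels=1)              # conditioned on the wire mask
wire_mask = (torch.rand(1, 1, 64, 64) > 0.98).float()    # stand-in segmentation mask
first_frame = ddpm_sample(scene_model, wire_mask)

# Stage 2: sample the motion distribution -- each new frame is conditioned on the previous
# frame, which is the frame-consistency idea described in the abstract.
motion_model = TinyDenoiser(cond_channels=1)
frames = [first_frame]
for _ in range(7):                                        # 8-frame clip; length is arbitrary
    frames.append(ddpm_sample(motion_model, frames[-1]))
video = torch.cat(frames, dim=0)                          # (8, 1, 64, 64) synthetic clip
print(video.shape)

Because the generated clip inherits its wire placement from wire_mask, each synthetic video comes with a segmentation label for free, which is what makes the approach useful for augmenting segmentation training data.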

Bibliographic Details
Main Authors: Pan, Shaoyan; Liu, Yikang; Zhao, Lin; Chen, Eric Z; Chen, Xiao; Chen, Terrence; Sun, Shanhui
Format: Article
Language: eng
Subjects: Computer Science - Artificial Intelligence; Computer Science - Computer Vision and Pattern Recognition
Online Access: Order full text
creator Pan, Shaoyan ; Liu, Yikang ; Zhao, Lin ; Chen, Eric Z ; Chen, Xiao ; Chen, Terrence ; Sun, Shanhui
description The accurate segmentation of guidewires in interventional cardiac fluoroscopy videos is crucial for computer-aided navigation tasks. Although deep learning methods have demonstrated high accuracy and robustness in wire segmentation, they require substantial annotated datasets for generalizability, underscoring the need for extensive labeled data to enhance model performance. To address this challenge, we propose the Segmentation-guided Frame-consistency Video Diffusion Model (SF-VD) to generate large collections of labeled fluoroscopy videos, augmenting the training data for wire segmentation networks. SF-VD leverages videos with limited annotations by independently modeling scene distribution and motion distribution. It first samples the scene distribution by generating 2D fluoroscopy images with wires positioned according to a specified input mask, and then samples the motion distribution by progressively generating subsequent frames, ensuring frame-to-frame coherence through a frame-consistency strategy. A segmentation-guided mechanism further refines the process by adjusting wire contrast, ensuring a diverse range of visibility in the synthesized image. Evaluation on a fluoroscopy dataset confirms the superior quality of the generated videos and shows significant improvements in guidewire segmentation.
doi_str_mv 10.48550/arxiv.2412.16050
format Article
creationdate 2024-12-20
rights http://arxiv.org/licenses/nonexclusive-distrib/1.0
oa free_for_read
backlink https://arxiv.org/abs/2412.16050
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2412.16050
language eng
recordid cdi_arxiv_primary_2412_16050
source arXiv.org
subjects Computer Science - Artificial Intelligence ; Computer Science - Computer Vision and Pattern Recognition
title Label-Efficient Data Augmentation with Video Diffusion Models for Guidewire Segmentation in Cardiac Fluoroscopy
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-05T08%3A59%3A44IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Label-Efficient%20Data%20Augmentation%20with%20Video%20Diffusion%20Models%20for%20Guidewire%20Segmentation%20in%20Cardiac%20Fluoroscopy&rft.au=Pan,%20Shaoyan&rft.date=2024-12-20&rft_id=info:doi/10.48550/arxiv.2412.16050&rft_dat=%3Carxiv_GOX%3E2412_16050%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true