Label-Efficient Data Augmentation with Video Diffusion Models for Guidewire Segmentation in Cardiac Fluoroscopy

The accurate segmentation of guidewires in interventional cardiac fluoroscopy videos is crucial for computer-aided navigation tasks. Although deep learning methods have demonstrated high accuracy and robustness in wire segmentation, they require substantial annotated datasets for generalizability, underscoring the need for extensive labeled data to enhance model performance. To address this challenge, we propose the Segmentation-guided Frame-consistency Video Diffusion Model (SF-VD) to generate large collections of labeled fluoroscopy videos, augmenting the training data for wire segmentation networks. SF-VD leverages videos with limited annotations by independently modeling scene distribution and motion distribution. It first samples the scene distribution by generating 2D fluoroscopy images with wires positioned according to a specified input mask, and then samples the motion distribution by progressively generating subsequent frames, ensuring frame-to-frame coherence through a frame-consistency strategy. A segmentation-guided mechanism further refines the process by adjusting wire contrast, ensuring a diverse range of visibility in the synthesized image. Evaluation on a fluoroscopy dataset confirms the superior quality of the generated videos and shows significant improvements in guidewire segmentation.
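The abstract describes a two-stage sampling pipeline: first sample the scene distribution (a 2D fluoroscopy frame whose wire follows a given segmentation mask), then sample the motion distribution by generating each subsequent frame conditioned on the previous one. The Python sketch below illustrates only that control flow, using untrained placeholder networks and a toy DDPM-style reverse loop; the class names, the 64x64 resolution, and the omission of the segmentation-guided contrast adjustment are illustrative assumptions, not the authors' SF-VD implementation.

import torch
import torch.nn as nn

class TinyDenoiser(nn.Module):
    """Placeholder denoiser: predicts noise from a noisy frame plus one conditioning image."""
    def __init__(self, cond_channels: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1 + cond_channels, 16, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),
        )

    def forward(self, x_t, cond):
        return self.net(torch.cat([x_t, cond], dim=1))

@torch.no_grad()
def ddpm_sample(denoiser, cond, steps=50, shape=(1, 1, 64, 64)):
    """Toy DDPM-style reverse loop; a real sampler would use a learned noise schedule."""
    betas = torch.linspace(1e-4, 2e-2, steps)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)
    x = torch.randn(shape)
    for t in reversed(range(steps)):
        eps = denoiser(x, cond)
        # Posterior mean of x_{t-1} given the predicted noise, then add noise except at t = 0.
        x = (x - betas[t] / torch.sqrt(1.0 - alpha_bars[t]) * eps) / torch.sqrt(alphas[t])
        if t > 0:
            x = x + torch.sqrt(betas[t]) * torch.randn_like(x)
    return x

# Stage 1: sample the scene distribution -- a first frame whose wire follows the input mask.
scene_model = TinyDenoiser(cond_channels=1)              # conditioned on the wire mask
wire_mask = (torch.rand(1, 1, 64, 64) > 0.98).float()    # stand-in segmentation mask
first_frame = ddpm_sample(scene_model, wire_mask)

# Stage 2: sample the motion distribution -- each new frame is conditioned on the previous
# frame, which is the frame-consistency idea described in the abstract.
motion_model = TinyDenoiser(cond_channels=1)
frames = [first_frame]
for _ in range(7):                                        # 8-frame clip; length is arbitrary
    frames.append(ddpm_sample(motion_model, frames[-1]))
video = torch.cat(frames, dim=0)                          # (8, 1, 64, 64) synthetic clip
print(video.shape)

Because the generated clip inherits its wire placement from wire_mask, each synthetic video comes with a segmentation label for free, which is what makes the approach useful for augmenting segmentation training data.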

Bibliographic Details
Main Authors: Pan, Shaoyan; Liu, Yikang; Zhao, Lin; Chen, Eric Z; Chen, Xiao; Chen, Terrence; Sun, Shanhui
Format: Article
Language: eng
Subjects: Computer Science - Artificial Intelligence; Computer Science - Computer Vision and Pattern Recognition
Online Access: Order full text
creator Pan, Shaoyan ; Liu, Yikang ; Zhao, Lin ; Chen, Eric Z ; Chen, Xiao ; Chen, Terrence ; Sun, Shanhui
description The accurate segmentation of guidewires in interventional cardiac fluoroscopy videos is crucial for computer-aided navigation tasks. Although deep learning methods have demonstrated high accuracy and robustness in wire segmentation, they require substantial annotated datasets for generalizability, underscoring the need for extensive labeled data to enhance model performance. To address this challenge, we propose the Segmentation-guided Frame-consistency Video Diffusion Model (SF-VD) to generate large collections of labeled fluoroscopy videos, augmenting the training data for wire segmentation networks. SF-VD leverages videos with limited annotations by independently modeling scene distribution and motion distribution. It first samples the scene distribution by generating 2D fluoroscopy images with wires positioned according to a specified input mask, and then samples the motion distribution by progressively generating subsequent frames, ensuring frame-to-frame coherence through a frame-consistency strategy. A segmentation-guided mechanism further refines the process by adjusting wire contrast, ensuring a diverse range of visibility in the synthesized image. Evaluation on a fluoroscopy dataset confirms the superior quality of the generated videos and shows significant improvements in guidewire segmentation.
doi_str_mv 10.48550/arxiv.2412.16050
format Article
creationdate 2024-12-20
rights http://arxiv.org/licenses/nonexclusive-distrib/1.0
oa free_for_read
backlink https://arxiv.org/abs/2412.16050
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2412.16050
language eng
recordid cdi_arxiv_primary_2412_16050
source arXiv.org
subjects Computer Science - Artificial Intelligence ; Computer Science - Computer Vision and Pattern Recognition
title Label-Efficient Data Augmentation with Video Diffusion Models for Guidewire Segmentation in Cardiac Fluoroscopy
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-05T08%3A59%3A44IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Label-Efficient%20Data%20Augmentation%20with%20Video%20Diffusion%20Models%20for%20Guidewire%20Segmentation%20in%20Cardiac%20Fluoroscopy&rft.au=Pan,%20Shaoyan&rft.date=2024-12-20&rft_id=info:doi/10.48550/arxiv.2412.16050&rft_dat=%3Carxiv_GOX%3E2412_16050%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true