De-redundancy in wireless capsule endoscopy video sequences using correspondence matching and motion analysis


Detailed description

Bibliographic details
Published in: Multimedia Tools and Applications, 2024-02, Vol. 83 (7), pp. 21171-21195
Main authors: Lan, Libin; Ye, Chunxiao; Liao, Chao; Wang, Chengliang; Feng, Xin
Format: Article
Language: English
Online access: Full text
Description
Abstract: Handling wireless capsule endoscopy (WCE) de-redundancy is a challenging task. This paper proposes a scheme, called SS-VCF-Der, that applies flow field estimation between two successive WCE frames to WCE imaging motion analysis and then addresses the WCE de-redundancy problem based on the results of that analysis. To this end, we exploit a self-supervised technique to learn interframe visual correspondence representations from large amounts of raw WCE video without manual supervision, and to predict the flow field. Our key idea is to use the natural spatial-temporal coherence in color and the cycle consistency in time of WCE videos as free supervisory signals to learn WCE visual correspondence relations from scratch. We call this procedure self-supervised visual correspondence flow learning (SS-VCF). At training time, we use three losses to train and optimize the model: a forward-backward cycle-consistency loss, a visual similarity loss, and a color loss. At test time, we use the learned representation to generate a flow field describing pixel movement between two successive WCE frames. Furthermore, from the resulting flow field estimation we compute the motion intensity of the motion fields between successive frames and use our proposed de-redundancy method, SS-VCF-MI, to select frames with distinct scene changes in a local neighborhood as key frames, thereby achieving de-redundancy. Extensive experiments on our collected WCE-2019-Video dataset show that the scheme achieves promising results, verifying its effectiveness for visual correspondence representation and redundancy removal in WCE videos.
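The motion-intensity step of SS-VCF-MI can be illustrated with a minimal sketch. This is our own reading of the abstract, not the authors' implementation: the function names, the mean-flow-magnitude definition of motion intensity, and the fixed threshold are all assumptions; the paper's actual selection rule over a local neighborhood may differ.

```python
import numpy as np

def motion_intensity(flow):
    """Mean per-pixel flow magnitude between two successive frames.
    `flow` has shape (H, W, 2), holding the (dx, dy) displacement of
    each pixel as predicted by the correspondence model."""
    return float(np.mean(np.linalg.norm(flow, axis=-1)))

def select_key_frames(flows, threshold):
    """Hypothetical key-frame selection: keep frame i+1 whenever the
    motion intensity of the flow from frame i to i+1 exceeds
    `threshold` (a distinct scene change); frame 0 is always kept."""
    keys = [0]
    for i, flow in enumerate(flows):
        if motion_intensity(flow) > threshold:
            keys.append(i + 1)
    return keys
```

For example, with three flows whose per-pixel displacements are (0, 0), (3, 4), and (0.1, 0), the motion intensities are 0, 5, and 0.1, so a threshold of 1.0 keeps frames 0 and 2 and drops the near-static frame 3 as redundant.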
ISSN: 1380-7501
ISSN: 1573-7721
DOI:10.1007/s11042-023-15530-7