De-redundancy in wireless capsule endoscopy video sequences using correspondence matching and motion analysis
Published in: | Multimedia tools and applications 2024-02, Vol.83 (7), p.21171-21195 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Handling wireless capsule endoscopy (WCE) de-redundancy is a challenging task. This paper proposes a scheme, called SS-VCF-Der, that applies flow field estimation between two successive WCE frames to WCE imaging motion analysis and then addresses the WCE de-redundancy problem based on the results of the motion analysis. To this end, we exploit a self-supervised technique to learn interframe visual correspondence representations from large amounts of raw WCE videos without manual human supervision, and to predict the flow field. Our key idea is to use the natural spatial-temporal coherence in color and the cycle consistency in time in WCE videos as a free supervisory signal to learn WCE visual correspondence relations from scratch. We call this procedure self-supervised visual correspondence flow learning (SS-VCF). At training time, we use three losses to train and optimize the model: a forward-backward cycle-consistency loss, a visual similarity loss, and a color loss. At test time, we use the acquired representation to generate a flow field for analyzing pixel movement between two successive WCE frames. Furthermore, from the resulting flow field estimation, we compute the motion intensity of the motion field between two successive frames and use our proposed de-redundancy method, namely SS-VCF-MI, to select as key frames those that show distinct scene changes within their local neighborhood, thereby removing redundant frames. Extensive experiments on our collected WCE-2019-Video dataset show that our scheme achieves promising results, verifying its effectiveness for visual correspondence representation and redundancy removal in WCE videos. |
---|---|
ISSN: | 1573-7721 1380-7501 |
DOI: | 10.1007/s11042-023-15530-7 |
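
The test-time step described in the summary (motion intensity from a flow field, then key-frame selection in a local neighborhood) can be illustrated with a minimal sketch. This is not the paper's SS-VCF-MI method: the record does not give its formulas, so the sketch assumes that motion intensity is the mean per-pixel displacement magnitude and that a key frame is one that follows a local maximum of that intensity. The names `motion_intensity`, `select_key_frames`, `radius`, and `min_intensity` are hypothetical, and the flow fields are synthetic stand-ins for a learned flow estimate.

```python
import math
from typing import List, Sequence, Tuple

# Assumed representation: rows of per-pixel (dx, dy) displacements.
Flow = Sequence[Sequence[Tuple[float, float]]]


def motion_intensity(flow: Flow) -> float:
    """Mean displacement magnitude over all pixels of one non-empty flow field.

    Assumption: the record does not define motion intensity; the mean
    magnitude is used here only as a plausible stand-in.
    """
    magnitudes = [math.hypot(dx, dy) for row in flow for dx, dy in row]
    return sum(magnitudes) / len(magnitudes)


def select_key_frames(
    intensities: Sequence[float],
    radius: int = 2,          # hypothetical neighborhood half-width
    min_intensity: float = 1.0,  # hypothetical threshold for a scene change
) -> List[int]:
    """Return indices of frames kept as key frames.

    intensities[i] is the motion intensity between frame i and frame i + 1.
    Frame 0 is always kept. Frame i + 1 is kept when intensities[i] is the
    largest value within +/- radius neighbors and reaches min_intensity.
    """
    keep = [0]
    for i, score in enumerate(intensities):
        window = intensities[max(0, i - radius): i + radius + 1]
        if score >= min_intensity and score == max(window):
            keep.append(i + 1)
    return keep


def constant_flow(value: float, height: int = 4, width: int = 4) -> Flow:
    """Synthetic flow field in which every pixel moves by (value, value)."""
    return [[(value, value)] * width for _ in range(height)]


# Eight synthetic flow fields between nine frames; two show strong motion.
speeds = [0.1, 0.2, 3.0, 0.1, 0.2, 0.1, 4.0, 0.3]
intensities = [motion_intensity(constant_flow(v)) for v in speeds]
print(select_key_frames(intensities))  # [0, 3, 7]
```

In this toy run only the frames that follow the two strong motion peaks are kept alongside the first frame, and the six frames between low-motion pairs are treated as redundant. The threshold and neighborhood size would need tuning on real WCE video.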