Local attention transformer-based full-view finger-vein identification

Bibliographic Details
Published in: IEEE transactions on circuits and systems for video technology 2022-12, p.1-1
Main Authors: Qin, Huafeng; Hu, Rongshan; El Yacoubi, Mounim; Li, Yantao; Gao, Xinbo
Format: Article
Language: English
Description
Summary: Multi-view finger-vein recognition technology has attracted increasing attention in recent years. Despite recent advances in multi-view finger-vein identification, existing solutions employ multiple monocular cameras from different views to record two-dimensional (2D) projections of 3D vein vessels, which causes the following problems: 1) 2D images collected from limited views (two or three views) are insufficient for robust 3D vein vessel feature representation. Furthermore, image sequences of the same finger acquired from different views usually show significant differences. As a result, existing methods remain sensitive to positional variations of the fingers, specifically those caused by finger roll movements. 2) Using multiple cameras increases cost. Moreover, it is impossible to employ several cameras to acquire full-view images because of the limited space on capturing devices. To address these issues, we present FV-LT, a Full-View Finger-Vein identification system based on a Local attention Transformer, built on an image acquisition device with a single camera. First, we design and implement a finger-vein acquisition prototype device in which a single camera and an LED group rotate around a finger for full-view image collection. This allows capturing all vein patterns concealed beneath human skin to form a complete representation of finger features. Second, given the full-view vein images, we propose a local attention transformer-based approach that extracts the dependencies of each token (a patch or an image) on its neighboring tokens, both among image patches and among full-view images. These dependency features are shown to be robust to positional variations induced by finger rolls. Using the public database of full-view finger-vein images captured by our device, we verify the performance of the proposed FV-LT. The experimental results show that FV-LT significantly outperforms existing 2D and multi-view based approaches, improving tolerance to finger roll and achieving state-of-the-art identification accuracy.
ISSN:1051-8215
DOI:10.1109/TCSVT.2022.3227385
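
Illustrative note: the summary describes local attention as modeling the dependency of each token on its neighboring tokens. The sketch below shows that general idea only; it is not the FV-LT implementation. The function name `local_attention`, the `radius` parameter, and the use of raw tokens as queries, keys, and values (no learned projections) are all assumptions made for illustration.

```python
import math


def local_attention(tokens, radius=1):
    """Windowed self-attention over a sequence of token vectors.

    Each token attends only to tokens at most `radius` positions away.
    Assumption for brevity: queries, keys and values are the raw tokens,
    so there are no learned projection matrices.
    """
    n = len(tokens)
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    outputs = []
    for i in range(n):
        # Indices of the local neighborhood of token i (clipped at the ends).
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        neighbors = range(lo, hi)
        # Scaled dot-product scores against the neighbors only.
        scores = [
            scale * sum(q * k for q, k in zip(tokens[i], tokens[j]))
            for j in neighbors
        ]
        # Numerically stable softmax over the neighborhood.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Output is the attention-weighted sum of the neighbor vectors.
        outputs.append([
            sum(w * tokens[j][d] for w, j in zip(weights, neighbors))
            for d in range(dim)
        ])
    return outputs
```

Under the same assumptions, the function could be applied at two levels, as the summary suggests: once to patch tokens within a single image, and once to per-view tokens across the full-view sequence. Because each output depends only on nearby tokens, a change in a distant patch or view leaves it unaffected.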