Lightweight real-time hand segmentation leveraging MediaPipe landmark detection

Bibliographic details
Published in: Virtual reality : the journal of the Virtual Reality Society 2023-12, Vol.27 (4), p.3125-3132
Main authors: Sánchez-Brizuela, Guillermo, Cisnal, Ana, de la Fuente-López, Eusebio, Fraile, Juan-Carlos, Pérez-Turiel, Javier
Format: Article
Language: English
Online access: Full text
Description
Abstract: Real-time hand segmentation is a key process in applications that require human–computer interaction, such as gesture recognition or augmented reality systems. However, the infinite shapes and orientations that hands can adopt, their variability in skin pigmentation and the self-occlusions that continuously appear in images make hand segmentation a truly complex problem, especially with uncontrolled lighting conditions and backgrounds. The development of robust, real-time hand segmentation algorithms is essential to achieve immersive augmented reality and mixed reality experiences by correctly interpreting collisions and occlusions. In this paper, we present a simple but powerful algorithm based on the MediaPipe Hands solution, a highly optimized neural network. The algorithm processes the landmarks provided by MediaPipe using morphological and logical operators to obtain the masks that allow dynamic updating of the skin color model. Different experiments were carried out comparing the influence of the color space on skin segmentation, with the CIELab color space chosen as the best option. An average intersection over union of 0.869 was achieved on the demanding Ego2Hands dataset running at 90 frames per second on a conventional computer without any hardware acceleration. Finally, the proposed segmentation procedure was implemented in an augmented reality application to add hand occlusion for improved user immersion. An open-source implementation of the algorithm is publicly available at https://github.com/itap-robotica-medica/lightweight-hand-segmentation .
ISSN: 1359-4338, 1434-9957
DOI: 10.1007/s10055-023-00858-0
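
The abstract mentions two ideas that a short example may make more concrete: a skin color model fitted from masks, and the intersection over union (IoU) score used for evaluation. The following is a minimal sketch and not the authors' implementation, which is available in the repository linked above. It assumes the image has already been converted to CIELab, and that a `seed_mask` (pixels taken to be hand) and a `search_mask` (region where the hand may lie) are given; in the paper such masks are derived from MediaPipe landmarks with morphological and logical operators, a step not reproduced here. The function names, the per-channel mean and standard deviation model, and the threshold `k` are all hypothetical choices made for illustration.

```python
import numpy as np


def iou(pred: np.ndarray, truth: np.ndarray) -> float:
    """Intersection over union of two masks with the same shape."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return float(np.logical_and(pred, truth).sum() / union)


def segment_by_color(lab: np.ndarray, seed_mask: np.ndarray,
                     search_mask: np.ndarray, k: float = 2.5) -> np.ndarray:
    """Illustrative color-model step (an assumption, not the paper's exact model).

    lab         : (H, W, 3) image assumed to be in CIELab already
    seed_mask   : (H, W) mask of pixels assumed to belong to the hand
    search_mask : (H, W) mask of the region in which hand pixels are accepted
    k           : hypothetical tolerance, in standard deviations per channel
    """
    lab = lab.astype(np.float64)
    seed_pixels = lab[seed_mask.astype(bool)]           # shape (n, 3)
    mean = seed_pixels.mean(axis=0)
    # Floor the spread so a very uniform seed does not give a zero-width band.
    std = np.maximum(seed_pixels.std(axis=0), 1.0)
    close = (np.abs(lab - mean) <= k * std).all(axis=-1)
    # Logical AND restricts the color match to the search region.
    return np.logical_and(close, search_mask.astype(bool))
```

Refitting `mean` and `std` on every frame from the current seed pixels is one simple way to read "dynamic updating of the skin color model"; whether the published algorithm models color this way is not stated in the abstract.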