Video-based neonatal pain expression recognition with cross-stream attention
Published in: | Multimedia tools and applications 2024, Vol. 83 (2), p. 4667-4690 |
---|---|
Main authors: | , , , , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Facial expression is considered the most specific indicator of pain and has been used effectively for neonatal pain assessment. Because neonates cannot verbalize their subjective pain experiences, automatic recognition of neonatal pain expressions is of great value. The Two-Stream Convolutional Network (TS-ConvNet) can effectively aggregate the spatial and temporal information in neonatal pain expression videos by adopting a two-stream structure. However, the traditional TS-ConvNet cannot exploit the correlation between the spatial stream and the temporal stream, because the two streams are independent of each other. To overcome this drawback, this paper presents a Cross-Stream Attention (CSA) mechanism with non-local operations to model the correlation between the two streams, and proposes a new model, TS-ConvNet with CSA units (TSCN-CSA), by introducing the CSA mechanism into TS-ConvNet. TSCN-CSA enables spatial and temporal information to interact at different semantic levels and employs ResNet-50 pre-trained on ImageNet as the backbone to extract neonatal pain expression features. In addition, to evaluate the proposed model, we collected a video dataset named Dynamic Facial Expression of Pain in Neonates (DFEPN), which comprises 1897 video clips with four categories of expression labels: calmness, crying, moderate pain, and severe pain. The experimental results on the DFEPN dataset demonstrate that CSA units have a positive effect and improve the accuracy of TS-ConvNets for neonatal pain expression recognition. The proposed method achieves promising recognition performance (66.20%) on four-category neonatal pain expression recognition. |
---|---|
ISSN: | 1380-7501, 1573-7721 |
DOI: | 10.1007/s11042-023-15403-z |
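
The summary describes cross-stream attention only in general terms, so the following is a minimal illustrative sketch rather than the paper's implementation. It assumes an embedded-Gaussian non-local operation in which one stream supplies the queries and the other supplies the keys and values, followed by a residual connection. The function name `cross_stream_attention`, the projection matrices, and all sizes (a flattened 7x7 feature map, 64 channels, 32 inner dimensions) are hypothetical choices made for illustration.

```python
import numpy as np


def softmax(z, axis=-1):
    # Subtract the row maximum for numerical stability.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def cross_stream_attention(x, y, w_theta, w_phi, w_g, w_out):
    """Update stream x with information gathered from stream y.

    x, y: (N, C) feature maps flattened over N positions.
    Assumed form: x + softmax(theta(x) phi(y)^T) g(y) W_out.
    """
    theta = x @ w_theta                      # queries from x, shape (N, D)
    phi = y @ w_phi                          # keys from y, shape (N, D)
    g = y @ w_g                              # values from y, shape (N, D)
    attn = softmax(theta @ phi.T, axis=-1)   # (N, N), each row sums to 1
    return x + (attn @ g) @ w_out, attn      # residual keeps the original x


rng = np.random.default_rng(0)
n_pos, channels, inner = 49, 64, 32          # hypothetical sizes
spatial = rng.standard_normal((n_pos, channels))
temporal = rng.standard_normal((n_pos, channels))


def make_weights():
    # Random stand-ins for learned projection matrices.
    shapes = [(channels, inner)] * 3 + [(inner, channels)]
    return [0.1 * rng.standard_normal(s) for s in shapes]


# Each stream attends to the other, so information flows in both directions.
spatial_out, attn_s = cross_stream_attention(spatial, temporal, *make_weights())
temporal_out, attn_t = cross_stream_attention(temporal, spatial, *make_weights())
```

Because of the residual connection, a unit whose output projection is zero leaves its stream unchanged, which is one common reason non-local blocks are inserted into pre-trained backbones this way. Whether TSCN-CSA uses this exact form is not stated in the summary.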