Automatic vs. Human Recognition of Pain Intensity from Facial Expression on the X-ITE Pain Database

Prior work on automated methods demonstrated that it is possible to recognize pain intensity from frontal faces in videos, while there is an assumption that humans are very adept at this task compared to machines. In this paper, we investigate whether such an assumption is correct by comparing the r...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Sensors (Basel, Switzerland) Switzerland), 2021-05, Vol.21 (9), p.3273, Article 3273
Hauptverfasser: Othman, Ehsan, Werner, Philipp, Saxen, Frerk, Al-Hamadi, Ayoub, Gruss, Sascha, Walter, Steffen
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Prior work on automated methods demonstrated that it is possible to recognize pain intensity from frontal faces in videos, while there is an assumption that humans are very adept at this task compared to machines. In this paper, we investigate whether such an assumption is correct by comparing the results achieved by two human observers with the results achieved by a Random Forest classifier (RFc) baseline model (called RFc-BL) and by three proposed automated models. The first proposed model is a Random Forest classifying descriptors of Action Unit (AU) time series; the second is a modified MobileNetV2 CNN classifying face images that combine three points in time; and the third is a custom deep network combining two CNN branches using the same input as for MobileNetV2 plus knowledge of the RFc. We conduct experiments with X-ITE phasic pain database, which comprises videotaped responses to heat and electrical pain stimuli, each of three intensities. Distinguishing these six stimulation types plus no stimulation was the main 7-class classification task for the human observers and automated approaches. Further, we conducted reduced 5-class and 3-class classification experiments, applied Multi-task learning, and a newly suggested sample weighting method. Experimental results show that the pain assessments of the human observers are significantly better than guessing and perform better than the automatic baseline approach (RFc-BL) by about 1%; however, the human performance is quite poor due to the challenge that pain that is ethically allowed to be induced in experimental studies often does not show up in facial reaction. We discovered that downweighting those samples during training improves the performance for all samples. The proposed RFc and two-CNNs models (using the proposed sample weighting) significantly outperformed the human observer by about 6% and 7%, respectively.
ISSN:1424-8220
1424-8220
DOI:10.3390/s21093273