Temporal attention learning for action quality assessment in sports video

Bibliographic details
Published in: Signal, Image and Video Processing, 2021-10, Vol. 15 (7), p. 1575-1583
Main authors: Lei, Qing; Zhang, Hongbo; Du, Jixiang
Format: Article
Language: English
Description
Summary: This paper proposes an end-to-end temporal attention learning method to improve action quality assessment in sports video. For temporally weighted training, an attention-learning module is built to simulate the attention mechanism and judgement preferences of human perception in action quality assessment. The weights are learned from the loss on the segment-level prediction errors and are used to balance the significance of the segment features. We evaluate the proposed method on the diving and gym-vault actions of the benchmark AQA-7 dataset. The experimental results show that the proposed attention-aware feature training method is more effective than temporal aggregation and existing temporal relationship learning methods. Furthermore, training only with the distance loss between the predicted score and the ground-truth score, without a ranking loss across different videos, the method achieves state-of-the-art performance on both the Spearman rank correlation and the mean Euclidean distance of the predicted scores against the judges' scores.
ISSN: 1863-1703, 1863-1711
DOI: 10.1007/s11760-021-01890-w
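
The summary names two evaluation measures: the Spearman rank correlation and the mean Euclidean distance between predicted and judge scores. The sketch below shows one plausible way to compute them for scalar scores. It is an illustration, not the authors' evaluation code: the function names are hypothetical, and treating the Euclidean distance between two scalar scores as their absolute difference is an assumption, since the record does not give the exact formula.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(predicted, judged):
    """Spearman rank correlation: Pearson correlation of the two rank lists.

    Raises ZeroDivisionError if either list is constant (zero rank variance).
    """
    rp, rj = average_ranks(predicted), average_ranks(judged)
    n = len(rp)
    mean_p, mean_j = sum(rp) / n, sum(rj) / n
    cov = sum((a - mean_p) * (b - mean_j) for a, b in zip(rp, rj))
    var_p = sum((a - mean_p) ** 2 for a in rp)
    var_j = sum((b - mean_j) ** 2 for b in rj)
    return cov / sqrt(var_p * var_j)


def mean_distance(predicted, judged):
    """Mean per-video distance between scalar scores (assumed |p - g|)."""
    return sum(abs(p - g) for p, g in zip(predicted, judged)) / len(predicted)
```

A higher Spearman value means the predicted ordering of videos agrees more closely with the judges' ordering, while a lower mean distance means the predicted scores are numerically closer to the judges' scores, so the two measures capture different aspects of quality.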