Human Interaction Understanding With Consistency-Aware Learning

Bibliographic details
Published in: IEEE transactions on pattern analysis and machine intelligence 2023-10, Vol.45 (10), p.11898-11914
Main authors: Meng, Jiajun; Wang, Zhenhua; Ying, Kaining; Zhang, Jianhua; Guo, Dongyan; Zhang, Zhen; Shi, Javen Qinfeng; Chen, Shengyong
Format: Article
Language: English
Description
Summary: Compared with the progress made on human activity classification, much less success has been achieved on human interaction understanding (HIU). Apart from the fact that the latter task is much more challenging, the main cause is that recent approaches learn human interactive relations via shallow graphical representations, which are inadequate for modeling complicated human interactive relations. This paper proposes a deep consistency-aware framework aimed at tackling the grouping and labelling inconsistencies in HIU. The framework consists of three components: a backbone CNN to extract image features, a factor graph network to implicitly learn higher-order consistencies among labelling and grouping variables, and a consistency-aware reasoning module to explicitly enforce consistencies. The last module is inspired by our key observation that the consistency-aware reasoning bias can be embedded into an energy function or a particular loss function, minimizing which delivers consistent predictions. An efficient mean-field inference algorithm is proposed, such that all modules of the network can be trained in an end-to-end fashion. Experimental results demonstrate that the two proposed consistency-learning modules complement each other, and that both contribute considerably to achieving leading performance on three HIU benchmarks. The effectiveness of the proposed approach is further validated by experiments on detecting human-object interactions.
ISSN: 0162-8828, 1939-3539, 2160-9292
DOI: 10.1109/TPAMI.2023.3280906