Neural Task Planning With AND-OR Graph Representations
Published in: IEEE Transactions on Multimedia, April 2019, Vol. 21, No. 4, pp. 1022-1034
Format: Article
Language: English
Abstract: This paper focuses on semantic task planning, that is, predicting a sequence of actions toward accomplishing a specific task in a given scene, which is a new problem in computer vision research. The primary challenges are how to model task-specific knowledge and how to integrate this knowledge into the learning procedure. In this paper, we propose training a recurrent long short-term memory (LSTM) network to address this problem, that is, taking a scene image (including pre-located objects) and the specified task as input and recurrently predicting action sequences. However, training such a network generally requires a large number of annotated samples to cover the semantic space (e.g., diverse action decompositions and orderings). To overcome this issue, we introduce a knowledge AND-OR graph (AOG) for task description, which hierarchically represents a task as atomic actions. With this AOG representation, we can produce many valid samples (i.e., action sequences that accord with common sense) by training another auxiliary LSTM network with a small set of annotated samples. Furthermore, these generated samples (i.e., task-oriented action sequences) effectively facilitate training of the model for semantic task planning. In our experiments, we create a new dataset that contains diverse daily tasks and extensively evaluate the effectiveness of our approach.
ISSN: 1520-9210, 1941-0077
DOI: 10.1109/TMM.2018.2870062
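The abstract describes representing a task as a hierarchy of atomic actions via a knowledge AND-OR graph (AOG) and sampling that graph to generate common-sense-valid action sequences for training. The sketch below is a minimal illustration of that idea, assuming a hypothetical "pour a glass of water" task; the node names, decomposition, and sampling routine are illustrative assumptions, not the paper's actual graph or implementation.

```python
import random
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    name: str
    kind: str  # "and" (ordered decomposition), "or" (alternatives), or "action" (leaf)
    children: List["Node"] = field(default_factory=list)

def sample_sequence(node: Node, rng: random.Random) -> List[str]:
    """Sample one valid atomic-action sequence by traversing the AOG:
    AND-nodes expand every child in order, OR-nodes choose one child,
    and leaf action nodes emit themselves."""
    if node.kind == "action":
        return [node.name]
    if node.kind == "or":
        return sample_sequence(rng.choice(node.children), rng)
    seq: List[str] = []
    for child in node.children:  # "and": concatenate children left to right
        seq += sample_sequence(child, rng)
    return seq

# Hypothetical AOG for a "pour a glass of water" task (illustrative only).
task = Node("pour water", "and", [
    Node("get container", "or", [
        Node("grasp cup", "action"),
        Node("grasp glass", "action"),
    ]),
    Node("fill container", "or", [
        Node("use tap", "and", [
            Node("move to sink", "action"),
            Node("turn on tap", "action"),
            Node("turn off tap", "action"),
        ]),
        Node("use kettle", "and", [
            Node("grasp kettle", "action"),
            Node("pour from kettle", "action"),
        ]),
    ]),
])

rng = random.Random(0)
for _ in range(3):
    print(" -> ".join(sample_sequence(task, rng)))
```

Sampling or enumerating such traversals is what allows a small annotated set to be expanded into many valid task-oriented action sequences, which, per the abstract, is the augmentation the authors use when training the LSTM planner.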