Recalling Unknown Manipulations by Spontaneously Sharing Actions with Similar Objects in Observation Based Learning

Bibliographic details
Published in: Kikai Gakkai ronbunshū = Transactions of the Japan Society of Mechanical Engineers 2023, pp.22-00274
Main authors: SANADA, Makoto, MATSUO, Tadashi, SHIMADA, Nobutaka, SHIRAI, Yoshiaki
Format: Article
Language: Japanese
Description
Abstract: This paper proposes a method by which a robot recalls multiple action candidates for an object after learning object manipulations from observed human actions. Learning multiple answers to a single input by supervised regression usually requires that all correct answers be mapped to the same input. When object manipulations are observed, however, only one action can be observed for an object at a time, and the other possible actions are not always observed for the identical object. It is therefore important to share observed actions automatically between similarly shaped objects by recognizing shape cues common to individual objects. The proposed method learns code descriptions of object shapes with a variational auto-encoder (VAE) that takes an object image as input, and code descriptions of actions with a conditional VAE (CVAE) that takes an action as input and the object shape as the condition. Because the action is the unknown target of recall, its code description should be obtained from the object shape alone at recall time. The distribution of action codes conditioned on the input object shape is obtained by marginalizing the distribution learned by the encoder part of the CVAE. Because this marginalization is difficult to carry out analytically or numerically, a deep regression model that “imitates” the marginal distribution is trained by a sampling-based maximum likelihood method. In this “marginalization by imitation” process, actions common to similarly shaped objects are shared among them. Various possible actions for an input object shape can then be recalled by repeatedly sampling from the imitated marginal distribution. The paper reports experiments using real object images and manipulation actions, and demonstrates the effectiveness of the proposed method.
ISSN:2187-9761
DOI:10.1299/transjsme.22-00274
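
The "marginalization by imitation" step described in the abstract can be written out roughly as follows. This is only a sketch of one plausible reading: the abstract gives no formulas, so all notation here is assumed for illustration (s for the shape code, a for an action, z for the action code, q_phi for the CVAE encoder, r_theta for the imitating regression model, D for the set of N observed shape-action pairs), and the shape code is treated as a fixed value for simplicity.

```latex
% Assumed notation, not taken from the paper:
%   s = shape code, a = action, z = action code,
%   q_\phi(z \mid a, s) = CVAE encoder, r_\theta(z \mid s) = imitating model,
%   \mathcal{D} = \{(s_i, a_i)\}_{i=1}^{N} = observed shape-action pairs.
\begin{align}
  % Target: encoder distribution marginalized over the actions possible for s
  p(z \mid s) &= \int q_\phi(z \mid a, s)\, p(a \mid s)\, \mathrm{d}a, \\
  % Imitation: maximum likelihood on codes sampled from the encoder
  \theta^{*} &= \arg\max_{\theta}\;
      \mathbb{E}_{(s,a) \sim \mathcal{D}}\,
      \mathbb{E}_{z \sim q_\phi(z \mid a, s)}
      \bigl[ \log r_\theta(z \mid s) \bigr] \\
  &\approx \arg\max_{\theta}\; \frac{1}{N} \sum_{i=1}^{N}
      \log r_\theta(z_i \mid s_i),
      \qquad z_i \sim q_\phi(z \mid a_i, s_i).
\end{align}
```

Under these assumptions, maximizing the expected log-likelihood is equivalent to minimizing the Kullback-Leibler divergence from p(z | s) to r_theta(z | s), averaged over the observed shapes. The integral is never evaluated directly; only samples from the encoder are needed. Because r_theta sees the shape but not the action, a model that varies smoothly with s would place probability on action codes observed for nearby shapes, which is one way to understand the sharing of actions among similar objects that the abstract describes.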