INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD AND LEARNING DEVICE


Bibliographic details
Main authors: IZUMINA KATSURO, WATANABE MASAHIKO
Format: Patent
Language: eng; jpn
Description
Abstract: To provide an information processing device that allows processing related to inverse reinforcement learning to be performed easily, even with simpler computer resources. SOLUTION: The information processing device comprises: a simulation image generation unit (12:S12) that generates image information representing a simulation image, the image including an abstracted road map, a first-type symbol representing a target vehicle, and a second-type symbol that represents a moving body and is distinguished according to the attribute of that body; an environmental information acquisition unit (12:S13) that acquires environmental information expressing the situation in which the moving body moves as the situation in which the second-type symbol moves; an action information acquisition unit (12:S14) that acquires expert action information expressing the travel mode of the target vehicle as the movement mode of the first-type symbol; a processor (13) that takes the environmental information and the action information as input and determines reward information as output according to a processing rule; and a learning control unit (14) that updates the processing rule in the processor according to a learning algorithm. SELECTED DRAWING: Figure 1
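
The abstract does not say which processing rule or learning algorithm is used, so the following Python sketch is only an illustration of the processor (13) / learning control unit (14) split under stated assumptions. It assumes, hypothetically, that symbols live on grid cells of the abstracted road map, that the processing rule is a linear reward over three hand-picked features, and that the learning algorithm is a simple perceptron-style update toward the expert's action. That update is a stand-in chosen for brevity, not the method of the patent. All names (`features`, `RewardProcessor`, `learning_step`, `train`, `DEMOS`) are invented for this example.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple

Cell = Tuple[int, int]  # hypothetical (row, col) position of a symbol on the map grid

# Hypothetical movement modes of the first-type symbol (the target vehicle).
ACTIONS = {"stay": (0, 0), "forward": (-1, 0), "left": (0, -1), "right": (0, 1)}
MAX_CLEARANCE = 5  # clearance feature is capped at this many cells


def features(ego: Cell, others: Sequence[Cell], action: str) -> List[float]:
    """Assumed feature vector for moving the ego symbol by `action`."""
    d_row, d_col = ACTIONS[action]
    nxt = (ego[0] + d_row, ego[1] + d_col)
    # Manhattan distance from the new ego cell to the nearest second-type symbol.
    nearest = min(
        (abs(nxt[0] - r) + abs(nxt[1] - c) for r, c in others),
        default=MAX_CLEARANCE,
    )
    return [
        1.0 if action == "forward" else 0.0,  # progress along the road
        1.0 if nearest == 0 else 0.0,         # lands on an occupied cell
        float(min(nearest, MAX_CLEARANCE)),   # clearance in cells
    ]


@dataclass
class RewardProcessor:
    """Stands in for the processor (13): the weights are the 'processing rule'."""
    weights: List[float] = field(default_factory=lambda: [0.0, 0.0, 0.0])

    def reward(self, ego: Cell, others: Sequence[Cell], action: str) -> float:
        return sum(w * f for w, f in zip(self.weights, features(ego, others, action)))

    def best_action(self, ego: Cell, others: Sequence[Cell]) -> str:
        return max(ACTIONS, key=lambda a: self.reward(ego, others, a))


def learning_step(proc: RewardProcessor, ego: Cell, others: Sequence[Cell],
                  expert_action: str, lr: float = 1.0) -> bool:
    """Stands in for the learning control unit (14). Returns True if weights changed."""
    # Strongest competing action under the current reward.
    rival = max((a for a in ACTIONS if a != expert_action),
                key=lambda a: proc.reward(ego, others, a))
    if proc.reward(ego, others, expert_action) > proc.reward(ego, others, rival):
        return False  # the expert action already scores strictly highest
    f_exp = features(ego, others, expert_action)
    f_riv = features(ego, others, rival)
    # Shift the reward toward the expert's features and away from the rival's.
    proc.weights = [w + lr * (fe - fr) for w, fe, fr in zip(proc.weights, f_exp, f_riv)]
    return True


def train(proc: RewardProcessor, demos, max_epochs: int = 200) -> int:
    """Repeat learning steps until one full pass changes nothing; return passes used."""
    for epoch in range(1, max_epochs + 1):
        changed = [learning_step(proc, ego, others, act) for ego, others, act in demos]
        if not any(changed):
            return epoch
    return max_epochs


# Two made-up expert demonstrations: (ego cell, other symbols, expert action).
DEMOS = [
    ((5, 2), [(4, 2), (5, 3)], "left"),  # blocked ahead and to the right
    ((5, 2), [(1, 2)], "forward"),       # road ahead is clear
]

proc = RewardProcessor()
train(proc, DEMOS)
print(proc.weights, [proc.best_action(ego, others) for ego, others, _ in DEMOS])
```

Because the inputs are a handful of symbol positions rather than rendered camera frames, each update in this sketch is a few arithmetic operations, which is in the spirit of the stated goal of running on simpler computer resources.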