Inverse discounted-based LQR algorithm for learning human movement behaviors

Bibliographic details
Published in: Applied Intelligence (Dordrecht, Netherlands), 2019-04, Vol. 49 (4), p. 1489-1501
Main authors: El-Hussieny, Haitham; Ryu, Jee-Hwan
Format: Article
Language: English
Online access: Full text
Description
Summary: Recently, there has been increasing interest in understanding human movement behaviors. One approach is to retrieve the unknown underlying objective function that the human optimizes while performing a certain movement behavior. Existing research on behavioral understanding depends merely on predefined optimality criteria, mainly minimum time, minimum variance, and/or minimum effort. These criteria are assumed to be constant, meaning the human is assumed to have the same preferences throughout the movement. In this paper, by contrast, the optimality criteria underlying the kinematic characteristics of a certain human behavior are assumed to be exponentially discounted, to account for changes in human preferences that could occur while performing this behavior. A new Inverse Discounted-based Linear Quadratic Regulator (ID-LQR) algorithm is developed within the Inverse Optimal Control (IOC) framework to find the discounted cost function that could reproduce the measured human behavior perfectly. In addition, an incremental version of the ID-LQR algorithm is proposed to continuously refine the cost function learned so far when demonstrations are presented sequentially. Saccadic eye gaze movement is studied as an example to evaluate both the proposed ID-LQR and incremental ID-LQR approaches. Simulation results are encouraging and show that the saccadic trajectories generated by the ID-LQR approach match the experimental data in many aspects, including the position and velocity profiles of saccades. Moreover, when assessed on a subsequent set of scenarios, the incremental ID-LQR algorithm confirms its capability to generalize the cost function retrieved so far to unseen saccadic demonstrations.
ISSN:0924-669X
1573-7497
DOI:10.1007/s10489-018-1331-y