Inverse discounted-based LQR algorithm for learning human movement behaviors
Published in: | Applied Intelligence (Dordrecht, Netherlands), 2019-04, Vol. 49 (4), p. 1489-1501 |
---|---|
Main authors: | , |
Format: | Article |
Language: | English |
Keywords: | |
Online access: | Full text |
Abstract: | Interest in understanding human movement behaviors has grown recently. One approach is to retrieve the unknown underlying objective function that the human optimizes while performing a given movement. Existing research on behavioral understanding relies on predefined optimality criteria, mainly minimum time, minimum variance, and/or minimum effort. These criteria are assumed to be constant, meaning that the human is assumed to have the same preferences throughout the movement. In this paper, by contrast, the optimality criteria underlying the kinematic characteristics of a given human behavior are assumed to be exponentially discounted, to account for changes in the human's preferences that may occur during the behavior. A new Inverse Discounted-based Linear Quadratic Regulator (ID-LQR) algorithm is developed within the Inverse Optimal Control (IOC) framework to find the discounted cost function that could reproduce the measured human behavior perfectly. In addition, an incremental version of the ID-LQR algorithm is proposed to continuously refine the cost function learned so far when demonstrations are presented sequentially. Saccadic eye movement is studied as an example to evaluate both the proposed ID-LQR and incremental ID-LQR approaches. Simulation results are encouraging and show that the saccadic trajectories generated by the ID-LQR approach match the experimental data in many aspects, including the position and velocity profiles of saccades. Moreover, when assessed on a subsequent set of scenarios, the incremental ID-LQR algorithm confirms its capability to generalize the cost function retrieved so far to unseen saccadic demonstrations. |
---|---|
ISSN: | 0924-669X, 1573-7497 |
DOI: | 10.1007/s10489-018-1331-y |
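
The abstract refers to an exponentially discounted quadratic cost. As a rough illustration of that idea only, the sketch below solves the forward discounted LQR problem, not the inverse problem that ID-LQR addresses, and it is not the authors' algorithm. It assumes a discrete-time, infinite-horizon setting with cost sum_t gamma^t (x_t' Q x_t + u_t' R u_t) and dynamics x_{t+1} = A x_t + B u_t; the record does not state the paper's formulation, so the function name `discounted_lqr_gain`, the example system, and all numeric values are hypothetical. It relies on the standard fact that scaling A and B by sqrt(gamma) turns the discounted problem into an ordinary LQR problem.

```python
import numpy as np

def discounted_lqr_gain(A, B, Q, R, gamma, iters=5000, tol=1e-12):
    """Feedback gain K (u = -K x) and cost matrix P for the discounted cost
    sum_t gamma^t (x'Qx + u'Ru) with x_{t+1} = A x + B u (assumed setting)."""
    A, B, Q, R = (np.asarray(M, dtype=float) for M in (A, B, Q, R))
    # Absorb the discount into the dynamics: the discounted problem for (A, B)
    # equals the undiscounted problem for (sqrt(gamma) A, sqrt(gamma) B).
    A_d = np.sqrt(gamma) * A
    B_d = np.sqrt(gamma) * B
    P = Q.copy()
    for _ in range(iters):
        # One step of the discrete-time Riccati recursion.
        K = np.linalg.solve(R + B_d.T @ P @ B_d, B_d.T @ P @ A_d)
        P_next = Q + A_d.T @ P @ (A_d - B_d @ K)
        converged = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if converged:
            break
    # Recompute the gain from the final P so K and P are consistent.
    K = np.linalg.solve(R + B_d.T @ P @ B_d, B_d.T @ P @ A_d)
    return K, P

# Hypothetical example: a double integrator (position, velocity) sampled at 0.1 s.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
Q = np.eye(2)
R = np.array([[0.1]])
K, P = discounted_lqr_gain(A, B, Q, R, gamma=0.95)
print("gain K:", K)
```

In an inverse setting, a routine like this would typically sit inside an outer loop that adjusts Q, R, and gamma until the simulated trajectories match the measured ones; that outer loop is not shown here.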