Use of PID control during Education in Reinforcement Learning on Two Wheel Balance Robot

Bibliographic details
Published in: Gazi Üniversitesi Fen Bilimleri Dergisi 2021-12, Vol.9 (4), p.597-607
Main authors: ATAÇ, Emrah; YILDIZ, Kazım; ÜLKÜ, Eyüp Emre
Format: Article
Language: English
Description
Abstract: The primary objective of this study was to shorten the training time of Reinforcement Learning (RL), a Machine Learning method, by using proportional-integral-derivative (PID) control during training. The study uses a balancing robot with two wheels on the same axis that can be controlled independently. While the robot is in balance, the RL software block observes how the PID block maintains the balance, so the RL block learns how to respond to disturbances without the robot physically falling and being raised again. Training the RL block requires creating approximately 500 policy / reward / path equations between the current-state and future-state matrices. The number of equations increases considerably when quantities such as previous position and acceleration are added. Approximately 1000 trial-and-error attempts are required for training, which means many falling / rising cycles. With the presented method, the RL block learned to keep the robot in balance in 900 trials, without falling and without requiring human intervention. This shortened the training time by about 60%.
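
The idea of letting an RL block observe a PID block while the robot stays upright can be illustrated with a minimal Python sketch. This is not the authors' implementation: the linearized pendulum model, the gains, the pendulum length, and the names `PID` and `collect_demonstrations` are all assumptions chosen for illustration. The sketch only shows a PID loop holding a toy tilt angle near zero while logging (state, action) pairs that a learner could later train on.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class PID:
    """Textbook discrete PID controller (gains are illustrative assumptions)."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: Optional[float] = None

    def update(self, error: float, dt: float) -> float:
        # Accumulate the integral term and estimate the derivative
        # by a backward difference (zero on the very first call).
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def collect_demonstrations(
    steps: int = 500, dt: float = 0.01, theta0: float = 0.1
) -> Tuple[List[Tuple[Tuple[float, float], float]], float]:
    """Balance a toy linearized pendulum with PID and log (state, action) pairs.

    Assumed toy model (not the robot from the article):
        theta'' = (g / l) * theta + u
    where theta is the tilt angle in radians and u is the control input.
    """
    pid = PID(kp=40.0, ki=1.0, kd=8.0)   # hypothetical gains
    g_over_l = 9.81 / 0.5                # hypothetical pendulum length of 0.5 m
    theta, omega = theta0, 0.0
    log = []
    for _ in range(steps):
        u = pid.update(0.0 - theta, dt)   # setpoint is upright (theta = 0)
        log.append(((theta, omega), u))   # what an observing learner would record
        alpha = g_over_l * theta + u      # angular acceleration of the toy model
        omega += alpha * dt               # semi-implicit Euler integration
        theta += omega * dt
    return log, theta


if __name__ == "__main__":
    demos, final_theta = collect_demonstrations()
    print(len(demos), "samples logged; final tilt:", final_theta)
```

The logged pairs could serve as a starting point for a learner, for example to initialize a policy before any trial-and-error phase; how the RL block in the article actually consumes the PID behaviour is not specified in the abstract.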
ISSN: 2147-9526
DOI: 10.29109/gujsc.955562