Reinforcement Learning With Model-Based Assistance for Shape Control in Sendzimir Rolling Mills

As one of the most popular tandem cold rolling mills, the Sendzimir rolling mill (ZRM) aims to obtain a flat steel strip shape by properly allocating the rolling pressure. To improve the performance of the ZRM, it is meaningful to adopt recently emerging deep reinforcement learning (DRL) that is pow...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE transactions on control systems technology 2023-07, Vol.31 (4), p.1867-1874
Hauptverfasser: Park, Jonghyuk, Kim, Beomsu, Han, Soohee
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:As one of the most popular tandem cold rolling mills, the Sendzimir rolling mill (ZRM) aims to obtain a flat steel strip shape by properly allocating the rolling pressure. To improve the performance of the ZRM, it is meaningful to adopt recently emerging deep reinforcement learning (DRL) that is powerful for difficult-to-solve and challenging problems. However, the direct application of DRL techniques may be impractical because of a serious singularity, partial observability, and even safety issues inherent in mill systems. In this brief, we propose an effective hybridization approach that integrates a model-based assistant into model-free DRL to resolve such practical issues. For the model-based assistant, a model-based optimization problem is first constructed and solved for the static part of the mill model. Then, the obtained static model-based coarse assistant, or controller, is improved by the proposed reinforcement learning, considering the remaining dynamic part of the mill model. The serious singularity can be resolved using the model-based approach, and the issue of partial observability is addressed by the long short-term memory (LSTM) state estimator in the proposed method. In simulation results, the proposed method successfully learns a highly performing policy for the ZRM, achieving a higher reward than pure model-free DRL. It is also observed that the proposed method can safely improve the shape controller of the mill system. The demonstration results strongly confirm the high applicability of DRL to other cold multiroll mills, such as four-high, six-high, and cluster mills.
ISSN:1063-6536
1558-0865
DOI:10.1109/TCST.2022.3227502