Reinforcement Learning With Model-Based Assistance for Shape Control in Sendzimir Rolling Mills
Published in: | IEEE Transactions on Control Systems Technology 2023-07, Vol.31 (4), p.1867-1874 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Abstract: | As one of the most popular tandem cold rolling mills, the Sendzimir rolling mill (ZRM) aims to obtain a flat steel strip shape by properly allocating the rolling pressure. To improve the performance of the ZRM, it is meaningful to adopt recently emerging deep reinforcement learning (DRL) that is powerful for difficult-to-solve and challenging problems. However, the direct application of DRL techniques may be impractical because of a serious singularity, partial observability, and even safety issues inherent in mill systems. In this brief, we propose an effective hybridization approach that integrates a model-based assistant into model-free DRL to resolve such practical issues. For the model-based assistant, a model-based optimization problem is first constructed and solved for the static part of the mill model. Then, the obtained static model-based coarse assistant, or controller, is improved by the proposed reinforcement learning, considering the remaining dynamic part of the mill model. The serious singularity can be resolved using the model-based approach, and the issue of partial observability is addressed by the long short-term memory (LSTM) state estimator in the proposed method. In simulation results, the proposed method successfully learns a highly performing policy for the ZRM, achieving a higher reward than pure model-free DRL. It is also observed that the proposed method can safely improve the shape controller of the mill system. The demonstration results strongly confirm the high applicability of DRL to other cold multiroll mills, such as four-high, six-high, and cluster mills. |
---|---|
ISSN: | 1063-6536, 1558-0865 |
DOI: | 10.1109/TCST.2022.3227502 |
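
The hybrid idea described in the abstract (a coarse controller from a static model, refined by a learned correction) can be illustrated with a minimal sketch. This is not the authors' implementation: the influence matrix `G`, the regularization weight `lam`, the correction bound `bound`, and the function names are all hypothetical, Tikhonov regularization is only one assumed way to handle a near-singular static model, and the LSTM state estimator and the training loop are omitted entirely.

```python
import numpy as np

def static_assistant(G, shape_error, lam=1e-2):
    """Coarse model-based action (assumed form): ridge least squares,
    argmin_u ||G u - e||^2 + lam * ||u||^2, which stays well-posed
    even when the columns of G are nearly linearly dependent."""
    n_act = G.shape[1]
    return np.linalg.solve(G.T @ G + lam * np.eye(n_act), G.T @ shape_error)

def hybrid_action(G, shape_error, residual_policy, bound=0.1, lam=1e-2):
    """Model-based action plus a bounded learned correction.
    Clipping the correction is an assumed safety device, not taken from the paper."""
    u_model = static_assistant(G, shape_error, lam)
    delta = np.clip(residual_policy(shape_error), -bound, bound)
    return u_model + delta

# Toy data: 8 shape sensors, 3 actuators, two nearly identical columns
# to mimic a near-singular static model (hypothetical numbers).
rng = np.random.default_rng(0)
G = rng.normal(size=(8, 3))
G[:, 2] = G[:, 0] + 1e-6 * rng.normal(size=8)
e = rng.normal(size=8)

# Placeholder for a trained policy: here it proposes no correction.
zero_policy = lambda err: np.zeros(3)
u = hybrid_action(G, e, zero_policy)
```

In this sketch the learned part can only shift the model-based action by at most `bound` per actuator, so an untrained or poorly trained policy cannot move the controller far from the coarse solution.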