Reinforcement Learning With Model-Based Assistance for Shape Control in Sendzimir Rolling Mills
As one of the most popular tandem cold rolling mills, the Sendzimir rolling mill (ZRM) aims to obtain a flat steel strip shape by properly allocating the rolling pressure. To improve the performance of the ZRM, it is meaningful to adopt recently emerging deep reinforcement learning (DRL) that is powerful for difficult-to-solve and challenging problems.
Published in: | IEEE transactions on control systems technology 2023-07, Vol.31 (4), p.1867-1874 |
---|---|
Main authors: | Park, Jonghyuk ; Kim, Beomsu ; Han, Soohee |
Format: | Article |
Language: | eng |
Subjects: | |
container_end_page | 1874 |
---|---|
container_issue | 4 |
container_start_page | 1867 |
container_title | IEEE transactions on control systems technology |
container_volume | 31 |
creator | Park, Jonghyuk ; Kim, Beomsu ; Han, Soohee |
description | As one of the most popular tandem cold rolling mills, the Sendzimir rolling mill (ZRM) aims to obtain a flat steel strip shape by properly allocating the rolling pressure. To improve the performance of the ZRM, it is meaningful to adopt recently emerging deep reinforcement learning (DRL) that is powerful for difficult-to-solve and challenging problems. However, the direct application of DRL techniques may be impractical because of a serious singularity, partial observability, and even safety issues inherent in mill systems. In this brief, we propose an effective hybridization approach that integrates a model-based assistant into model-free DRL to resolve such practical issues. For the model-based assistant, a model-based optimization problem is first constructed and solved for the static part of the mill model. Then, the obtained static model-based coarse assistant, or controller, is improved by the proposed reinforcement learning, considering the remaining dynamic part of the mill model. The serious singularity can be resolved using the model-based approach, and the issue of partial observability is addressed by the long short-term memory (LSTM) state estimator in the proposed method. In simulation results, the proposed method successfully learns a highly performing policy for the ZRM, achieving a higher reward than pure model-free DRL. It is also observed that the proposed method can safely improve the shape controller of the mill system. The demonstration results strongly confirm the high applicability of DRL to other cold multiroll mills, such as four-high, six-high, and cluster mills. |
doi_str_mv | 10.1109/TCST.2022.3227502 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1063-6536 |
ispartof | IEEE transactions on control systems technology, 2023-07, Vol.31 (4), p.1867-1874 |
issn | 1063-6536 ; 1558-0865 |
language | eng |
recordid | cdi_crossref_primary_10_1109_TCST_2022_3227502 |
source | IEEE Electronic Library (IEL) |
subjects | Actor-critic policy gradient ; Actuators ; Cluster mills ; cold rolling mill ; Cold rolling mills ; Cold tandem mills ; Controllers ; Deep learning ; Heuristic algorithms ; Metal strips ; Observability ; Optimization ; partially observable Markov decision process (MDP) ; reinforcement learning ; Sendzimir rolling mill (ZRM) ; Sendzimir Z mills ; Shape ; Shape control ; Singularities ; State estimation ; Static models ; Steel ; Strip steel ; Strips ; System effectiveness |
title | Reinforcement Learning With Model-Based Assistance for Shape Control in Sendzimir Rolling Mills |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T18%3A11%3A28IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Reinforcement%20Learning%20With%20Model-Based%20Assistance%20for%20Shape%20Control%20in%20Sendzimir%20Rolling%20Mills&rft.jtitle=IEEE%20transactions%20on%20control%20systems%20technology&rft.au=Park,%20Jonghyuk&rft.date=2023-07&rft.volume=31&rft.issue=4&rft.spage=1867&rft.epage=1874&rft.pages=1867-1874&rft.issn=1063-6536&rft.eissn=1558-0865&rft.coden=IETTE2&rft_id=info:doi/10.1109/TCST.2022.3227502&rft_dat=%3Cproquest_RIE%3E2828941151%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2828941151&rft_id=info:pmid/&rft_ieee_id=9991941&rfr_iscdi=true |
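
The description above outlines a two-stage idea: a coarse controller obtained from the static part of the mill model, then refined by reinforcement learning. The sketch below is only an illustration of that general pattern and is not the authors' formulation. It assumes a linear static model `G @ u = shape_error`, uses damped least squares as one common way to keep a near-singular solve well-posed, and adds a clipped correction standing in for a learned policy output. The names `G`, `damping`, `residual`, and `residual_limit` are hypothetical, and the LSTM state estimator mentioned in the description is omitted.

```python
import numpy as np


def coarse_static_action(G, shape_error, damping=1e-2):
    """Damped least-squares solution of G @ u ~ shape_error.

    The damping term keeps the linear solve well-posed even when
    G.T @ G is singular (assumed stand-in for the static optimization).
    """
    n_act = G.shape[1]
    lhs = G.T @ G + damping * np.eye(n_act)
    rhs = G.T @ shape_error
    return np.linalg.solve(lhs, rhs)


def assisted_action(G, shape_error, residual, residual_limit=0.1):
    """Model-based coarse action plus a bounded learned correction."""
    u_model = coarse_static_action(G, shape_error)
    # Clipping bounds how far the learned part can move the command
    # away from the model-based action (an assumed safety measure).
    u_learned = np.clip(residual, -residual_limit, residual_limit)
    return u_model + u_learned


# Hypothetical rank-deficient influence matrix: both actuators have
# identical effect on the measured shape, so G.T @ G is singular.
G = np.array([[1.0, 1.0], [1.0, 1.0], [2.0, 2.0]])
shape_error = np.array([1.0, 1.0, 2.0])

# Placeholder for the output of a trained policy network.
residual = np.array([0.5, -0.02])

u = assisted_action(G, shape_error, residual)
print(u)
```

In this toy case an undamped solve would fail because `G.T @ G` is singular, while the damped solve splits the effort evenly between the two equivalent actuators. The first residual component exceeds the limit and is clipped, so the learned part can only nudge the model-based command.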