Active flow control using deep reinforcement learning with time delays in Markov decision process and autoregressive policy

Classical active flow control (AFC) methods based on solving the Navier–Stokes equations are laborious and computationally intensive even with the use of reduced-order models. Data-driven methods offer a promising alternative for AFC, and they have been applied successfully to reduce the drag of two-dimensional bluff bodies, such as a circular cylinder, using deep reinforcement-learning (DRL) paradigms.

Detailed description

Bibliographic details
Published in: Physics of fluids (1994) 2022-05, Vol.34 (5)
Main authors: Zhong, Shan; Yin, Hujun
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue 5
container_start_page
container_title Physics of fluids (1994)
container_volume 34
creator Zhong, Shan
Yin, Hujun
description Classical active flow control (AFC) methods based on solving the Navier–Stokes equations are laborious and computationally intensive even with the use of reduced-order models. Data-driven methods offer a promising alternative for AFC, and they have been applied successfully to reduce the drag of two-dimensional bluff bodies, such as a circular cylinder, using deep reinforcement-learning (DRL) paradigms. However, due to the onset of weak turbulence in the wake, the standard DRL method tends to result in large fluctuations in the unsteady forces acting on the cylinder as the Reynolds number increases. In this study, a Markov decision process (MDP) with time delays is introduced to model and quantify the action delays in the environment in a DRL process, which arise from the time difference between control actuation and flow response, along with the use of a first-order autoregressive policy (ARP). This hybrid DRL method is applied to control the vortex-shedding process from a two-dimensional circular cylinder using four synthetic jet actuators at a freestream Reynolds number of 400. This method has yielded a stable and coherent control, which results in a steadier and more elongated vortex formation zone behind the cylinder and, hence, a much weaker vortex-shedding process and less fluctuating lift and drag forces. Compared to the standard DRL method, this method utilizes the historical samples without additional sampling in training, and it is capable of reducing the magnitude of drag and lift fluctuations by approximately 90% while achieving a similar level of drag reduction to that of deterministic control at the same actuation frequency. This study demonstrates the necessity of including a physics-informed delay and regressive nature in the MDP and the benefits of introducing ARPs to achieve a robust and temporally coherent control of unsteady forces in active flow control.
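The description names two ingredients, a delay-aware Markov decision process and a first-order autoregressive policy, that can be illustrated with a minimal sketch. The Python below is not the authors' code; the delay length d, the smoothing coefficient alpha, and the environment interface are illustrative assumptions only.

import numpy as np

class DelayAugmentedEnv:
    """Wrap a flow-control environment whose response lags the action by d steps.

    The observation handed to the agent is the raw flow measurement concatenated
    with the last d actions, which restores the Markov property of the delayed MDP.
    The wrapped environment's step() is assumed to return (obs, reward, done).
    """

    def __init__(self, env, d=3, action_dim=4):
        self.env = env                              # underlying flow solver interface (assumed API)
        self.d = d                                  # illustrative delay length, not from the paper
        self.action_buffer = np.zeros((d, action_dim))

    def step(self, action):
        obs, reward, done = self.env.step(action)
        # Shift the buffer and append the newest action.
        self.action_buffer = np.roll(self.action_buffer, -1, axis=0)
        self.action_buffer[-1] = action
        aug_obs = np.concatenate([np.asarray(obs).ravel(), self.action_buffer.ravel()])
        return aug_obs, reward, done

def ar1_action(prev_action, policy_action, alpha=0.8):
    """First-order autoregressive (AR(1)) smoothing of the raw policy output.

    a_t = alpha * a_{t-1} + (1 - alpha) * a_tilde_t; alpha is a hypothetical
    smoothing coefficient, and the paper's exact ARP formulation may differ.
    """
    return alpha * np.asarray(prev_action) + (1.0 - alpha) * np.asarray(policy_action)

Augmenting the observation with recent actions restores the Markov property under actuation delay, while the AR(1) smoothing suppresses abrupt jet-strength changes, which is consistent with the steadier lift and drag forces reported in the abstract.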
doi_str_mv 10.1063/5.0086871
format Article
fulltext fulltext
identifier ISSN: 1070-6631
ispartof Physics of fluids (1994), 2022-05, Vol.34 (5)
issn 1070-6631
1089-7666
language eng
recordid cdi_scitation_primary_10_1063_5_0086871
source AIP Journals Complete; Alma/SFX Local Collection
subjects Active control
Actuation
Actuators
Autoregressive processes
Bluff bodies
Circular cylinders
Deep learning
Drag
Drag reduction
Flow control
Fluid dynamics
Fluid flow
Lift
Markov analysis
Markov processes
Physics
Reduced order models
Reynolds number
Robust control
Shedding
Synthetic jets
Two dimensional bodies
Vortices
title Active flow control using deep reinforcement learning with time delays in Markov decision process and autoregressive policy
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-27T21%3A00%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_scita&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Active%20flow%20control%20using%20deep%20reinforcement%20learning%20with%20time%20delays%20in%20Markov%20decision%20process%20and%20autoregressive%20policy&rft.jtitle=Physics%20of%20fluids%20(1994)&rft.au=Zhong,%20Shan&rft.date=2022-05&rft.volume=34&rft.issue=5&rft.issn=1070-6631&rft.eissn=1089-7666&rft.coden=PHFLE6&rft_id=info:doi/10.1063/5.0086871&rft_dat=%3Cproquest_scita%3E2660168354%3C/proquest_scita%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2660168354&rft_id=info:pmid/&rfr_iscdi=true