Towards fine tuning wake steering policies in the field: an imitation-based approach


Bibliographic Details
Published in: Journal of physics. Conference series, 2024-06, Vol. 2767 (3), p. 32017
Main authors: Bizon Monroc, C; Bušić, A; Dubuc, D; Zhu, J
Format: Article
Language: English
Online access: Full text
Description: Yaw misalignment strategies can increase the power output of wind farms by mitigating wake effects, but finding optimal yaws requires overcoming both modeling errors and the growing complexity of the problem as the size of the farm grows. Recent works have therefore proposed decentralized multi-agent reinforcement learning (MARL) as a model-free, data-based alternative to learn online. These solutions have led to significant increases in total power production in experiments with both static and dynamic wind farm simulators. Yet experiments in dynamic simulations suggest that convergence time remains too long for online learning on real wind farms. As an improvement, baseline policies obtained by optimizing offline through steady-state models can be fed as inputs to an online reinforcement learning algorithm. This method, however, does not guarantee a smooth transfer of the policies to the real wind farm. This is aggravated when function approximation approaches such as multi-layer neural networks are used to estimate policies and value functions. We propose an imitation approach, where learning a policy is first treated as a supervised learning problem by deriving references from steady-state wind farm models, and then as an online reinforcement learning task for adaptation in the field. This approach leads to significant increases in the amount of energy produced over a lookup table (LUT) baseline in experiments done with the mid-fidelity dynamic simulator FAST.Farm under both static and varying wind conditions.
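The description outlines a two-stage pipeline: first imitate yaw references derived from a steady-state model or lookup table via supervised learning, then fine-tune the resulting policy online with reinforcement learning using the farm's power feedback. The following minimal sketch is only an illustration of that general shape under stated assumptions; it is not the paper's implementation. The linear policy, the synthetic wind and lookup-table data, and the farm_power stand-in for the simulator's power signal are all hypothetical placeholders, whereas the paper uses multi-layer neural network policies evaluated in FAST.Farm.

    # Illustrative sketch (not the authors' code): imitation pre-training of a
    # linear yaw policy from lookup-table references, then a crude online
    # REINFORCE-style fine-tuning step driven by a measured power signal.
    import numpy as np

    rng = np.random.default_rng(0)

    # --- Stage 1: supervised imitation of steady-state references ----------
    wind = rng.uniform(0.0, 2 * np.pi, size=(500, 1))   # placeholder wind directions
    lut_yaw = 0.3 * np.sin(wind)                         # placeholder LUT yaw references

    X = np.hstack([wind, np.ones_like(wind)])            # linear features plus bias
    w, *_ = np.linalg.lstsq(X, lut_yaw, rcond=None)      # least-squares imitation fit

    # --- Stage 2: online fine-tuning from power feedback --------------------
    def farm_power(yaw, wind_dir):
        # Stand-in for the power feedback a real farm (or FAST.Farm) would return.
        return -np.mean((yaw - 0.35 * np.sin(wind_dir)) ** 2)

    lr, sigma = 1e-2, 0.05
    for step in range(200):
        wd = rng.uniform(0.0, 2 * np.pi, size=(1, 1))
        feats = np.hstack([wd, np.ones_like(wd)])
        mean_yaw = feats @ w
        yaw = mean_yaw + sigma * rng.standard_normal(mean_yaw.shape)  # exploration noise
        reward = farm_power(yaw, wd)
        # Score-function (REINFORCE) update: gradient of the Gaussian policy's
        # log-likelihood with respect to w, scaled by the observed reward.
        grad = feats.T @ ((yaw - mean_yaw) / sigma**2) * reward
        w = w + lr * grad

In practice the imitation stage gives the online learner a reasonable starting policy, so the fine-tuning stage only has to correct for the mismatch between the steady-state model and the dynamic farm rather than learn wake steering from scratch.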
DOI: 10.1088/1742-6596/2767/3/032017
ISSN: 1742-6588
EISSN: 1742-6596
Source: IOP Publishing Free Content; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; IOPscience extra; Alma/SFX Local Collection; Free Full-Text Journals in Chemistry
Subjects:
Algorithms
Cognitive tasks
Computational Physics
Distance learning
Lookup tables
Machine learning
Misalignment
Multiagent systems
Multilayers
Neural networks
Optimization
Physics
Policies
Simulators
Steady state models
Steering
Supervised learning
Wind farms
Wind power
Yaw