Experimental control of the flow separation behind a backward facing step using deep reinforcement learning

Bibliographic details
Published in: Physics of Fluids (1994), 2024-10, Vol. 36 (10)
Main authors: Xiang, Jiawei; Zong, Haohua; Wu, Yun; Li, Jinping; Liang, Hua
Format: Article
Language: English
Online access: Full text
Description
Summary: In this experimental study, a value-based reinforcement learning algorithm (deep Q-network, DQN) is used to control the flow separation behind a backward-facing step at a Reynolds number of 2.9 × 10^4. The flow is forced by a dielectric barrier discharge (DBD) plasma actuator mounted upstream of the step edge, and feedback on the separation zone is provided by a hot-wire sensor immersed in the downstream shear layer. The control law, represented by a deep neural network, is implemented on a field-programmable gate array (FPGA) and can execute in real time at frequencies as high as 1000 Hz. Results show that both open-loop periodic control and DQN control effectively reduce the reattachment length and the recirculation area. Whereas the former requires dozens of trial-and-error measurements lasting for hours, the latter finds an optimal control law in only two minutes and achieves a long-term reward 7% higher. Moreover, by introducing a weak penalty term for plasma actuation, the mean actuator power consumption under DQN control can be cut to only 60% of that in the optimal open-loop control, while sacrificing a negligible amount of control effectiveness. Physically, the open-loop periodic control destabilizes the shear layer earlier, increasing both the area and the peak amplitude of the high turbulent kinetic energy (TKE) zone, whereas under DQN control only a slight increase in the TKE peak is observed, and the overall spatial distribution remains the same as in the baseline.
ISSN: 1070-6631, 1089-7666
DOI: 10.1063/5.0231459
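
Illustrative note: the summary mentions a weak penalty term for plasma actuation and a long-term reward, but does not give their exact form. The Python sketch below shows one common way such quantities are written in value-based reinforcement learning. Everything in it is an assumption made for illustration and is not taken from the paper: the binary on/off action, the function names, the penalty weight, the discount factor, and the epsilon-greedy selection rule.

```python
import random


def penalized_reward(flow_reward, action, penalty_weight=0.05):
    """Flow-based reward minus a small cost when the actuator is on.

    Assumption: action is 0 (actuator off) or 1 (actuator on), and
    penalty_weight is a hypothetical value, not one from the paper.
    """
    return flow_reward - penalty_weight * action


def discounted_return(rewards, gamma=0.99):
    """Long-term reward as the discounted sum of gamma**k * rewards[k]."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def epsilon_greedy(q_values, epsilon, rng=random):
    """Explore with probability epsilon, otherwise take the best-valued action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


if __name__ == "__main__":
    # Hypothetical Q-values for the two actions (off, on).
    q = [0.2, 0.7]
    a = epsilon_greedy(q, epsilon=0.0)
    print("chosen action:", a)
    print("reward with penalty:", penalized_reward(1.0, a, penalty_weight=0.1))
    print("discounted return:", discounted_return([1.0, 1.0, 1.0], gamma=0.5))
```

With a penalty of this kind, an agent that obtains nearly the same flow reward with the actuator off is nudged toward switching it off, which is consistent with the reduced power consumption reported in the summary.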