Experimental control of the flow separation behind a backward facing step using deep reinforcement learning

Bibliographic details
Published in:	Physics of fluids (1994) 2024-10, Vol.36 (10)
Main authors:	Xiang, Jiawei; Zong, Haohua; Wu, Yun; Li, Jinping; Liang, Hua
Format:	Article
Language:	English
Subjects:	Actuation; Algorithms; Artificial neural networks; Backward facing steps; Control theory; Deep learning; Dielectric barrier discharge; Error analysis; Field programmable gate arrays; Flow separation; Fluid dynamics; Fluid flow; Kinetic energy; Machine learning; Optimal control; Real time; Reynolds number; Sensor arrays; Shear layers; Spatial distribution
Online access:	Full text

Description
Abstract:	In this experimental study, a value-based reinforcement learning algorithm (deep Q-network, DQN) is used to control the flow separation behind a backward facing step at a Reynolds number of 2.9 × 10^4. The flow is forced by a dielectric barrier discharge (DBD) plasma actuator mounted upstream of the step edge, and feedback on the separation zone is provided by a hot-wire sensor submerged in the downstream shear layer. The control law, represented by a deep neural network, is implemented on a field-programmable gate array (FPGA) and executes in real time at frequencies as high as 1000 Hz. Results show that both open-loop periodic control and DQN control effectively reduce the reattachment length and the recirculation area. Whereas the former requires dozens of trial-and-error measurements lasting hours, the latter finds an optimal control law in only two minutes and achieves a long-term reward 7% higher. Moreover, by introducing a weak penalty term for plasma actuation, the mean actuator power consumption under DQN control can be cut to only 60% of that under the optimal open-loop control, while sacrificing a negligible amount of control effectiveness. Physically, the open-loop periodic control destabilizes the shear layer earlier, increasing both the area and the peak amplitude of the high turbulent kinetic energy (TKE) zone, whereas under DQN control only a slight increase in the TKE peak is observed, and the overall spatial distribution remains the same as in the baseline.
ISSN:	1070-6631 1089-7666
DOI:	10.1063/5.0231459
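
Note:	The abstract mentions a value-based (DQN) control law and a weak penalty term for plasma actuation, but this record gives no formulas. The following is only a minimal sketch of how such a penalized reward and the standard Q-learning target are commonly written. The binary on/off action set, the function names, the penalty weight, and the discount factor are all hypothetical assumptions for illustration, not values taken from the article.

```python
from typing import Sequence


def penalized_reward(flow_reward: float, actuator_on: bool,
                     penalty_weight: float = 0.05) -> float:
    """Flow-control reward minus a small cost whenever the actuator fires.

    `flow_reward` stands in for whatever sensor-derived quantity is being
    maximized; `penalty_weight` is an assumed, illustrative value.
    """
    return flow_reward - penalty_weight * (1.0 if actuator_on else 0.0)


def td_target(reward: float, next_q_values: Sequence[float],
              gamma: float = 0.99) -> float:
    """Standard Q-learning target: r + gamma * max over a' of Q(s', a')."""
    return reward + gamma * max(next_q_values)


def greedy_action(q_values: Sequence[float]) -> int:
    """Index of the action with the largest estimated Q-value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


if __name__ == "__main__":
    # Assumed binary action set: index 0 = actuator off, index 1 = actuator on.
    q_next = [0.5, 2.0]
    r = penalized_reward(1.0, actuator_on=True, penalty_weight=0.25)
    print(r, td_target(r, q_next, gamma=0.5), greedy_action(q_next))
```

With a larger penalty weight, actions that switch the actuator on need a correspondingly larger flow-control benefit to be preferred, which is the general mechanism by which such a term can lower mean actuator power.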