Coordinated Optimization of Generation and Compensation to Enhance Short-Term Voltage Security of Power Systems Using Accelerated Multi-Objective Reinforcement Learning

Bibliographic Details
Published in: IEEE Access, 2020, Vol. 8, pp. 34770-34782
Authors: Deng, Zhuoming; Lu, Zhilin; Guo, Zhifei; Yao, Wenfeng; Zhao, Wenmeng; Zhou, Baorong; Hong, Chao
Format: Article
Language: English
Abstract: High proportions of asynchronous motors on the demand side place heavy pressure on the short-term voltage security of receiving-end power systems. To enhance short-term voltage security, this paper coordinates the optimal outputs of generation and compensation in a multi-objective dynamic optimization model. With equipment dynamics, network load flows, lower and upper limits, and security constraints considered, the model simultaneously minimizes two objectives: the expense of the control decision and the voltage deviation. The Radau collocation method handles the dynamics by transforming all differential-algebraic equations into algebraic ones. Most importantly, Pareto solutions are obtained through an accelerated multi-objective reinforcement learning (AMORL) method by filtering out the dominated solutions. The entire feasible region is partitioned into small independent regions to narrow the search scope for Pareto solutions. In addition, the AMORL method redefines the state functions and introduces novel state sensitivities, which accelerate the switch from learning to application once the agent has accumulated sufficient knowledge. Furthermore, the Pareto solutions are diversified by introducing additional potential solutions. Finally, a fuzzy decision-making methodology selects the tradeoff solution. Case studies on a practical 748-node power grid validate the acceleration and efficiency of the AMORL method. The AMORL method is overall superior to the conventional reinforcement learning (RL) method, with better non-dominated objective values, much shorter CPU time, and better convergence to the accurate values. Moreover, compared with three other state-of-the-art RL methods, the AMORL method takes almost the same CPU time of several seconds but is slightly superior in terms of optimal objective values. Additionally, the values calculated by the AMORL method fit the accurate values best at each iteration, resulting in good convergence.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2020.2974503
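
The abstract mentions two generic post-processing steps: filtering out dominated solutions to keep the Pareto front and selecting the tradeoff solution with fuzzy decision-making. The Python sketch below illustrates those two steps only, using the common linear-membership formulation of the fuzzy decision; the candidate objective values are hypothetical placeholders, and nothing here reproduces the paper's AMORL agent, the Radau collocation transcription, or the 748-node grid model.

# Minimal sketch: Pareto (non-dominated) filtering followed by a fuzzy
# max-membership tradeoff pick. Both objectives are minimized:
# column 0 = control-decision expense, column 1 = voltage deviation.
# All numerical values below are illustrative assumptions, not paper data.
import numpy as np

def pareto_filter(objectives: np.ndarray) -> np.ndarray:
    """Return indices of the non-dominated rows of a (n_solutions, n_objectives) array."""
    n = objectives.shape[0]
    keep = np.ones(n, dtype=bool)
    for i in range(n):
        # Row j dominates row i if it is <= in every objective
        # and strictly < in at least one.
        dominated_by = np.all(objectives <= objectives[i], axis=1) & \
                       np.any(objectives < objectives[i], axis=1)
        if dominated_by.any():
            keep[i] = False
    return np.where(keep)[0]

def fuzzy_tradeoff(front: np.ndarray) -> int:
    """Pick a compromise point on the front with linear membership functions:
    membership is 1 at the per-objective minimum and 0 at the maximum; the
    solution with the largest normalized total membership is selected."""
    f_min = front.min(axis=0)
    f_max = front.max(axis=0)
    span = np.where(f_max > f_min, f_max - f_min, 1.0)  # guard against zero span
    mu = (f_max - front) / span
    score = mu.sum(axis=1) / mu.sum()
    return int(np.argmax(score))

# Hypothetical candidate solutions (expense, voltage deviation).
candidates = np.array([[3.2, 0.9], [2.5, 1.4], [4.0, 0.5], [2.7, 1.3], [3.8, 1.1]])
front_idx = pareto_filter(candidates)
best = front_idx[fuzzy_tradeoff(candidates[front_idx])]
print("Pareto front indices:", front_idx)
print("Tradeoff solution:", candidates[best])

Running the sketch keeps the four mutually non-dominated candidates and drops the last one (dominated by the first); the fuzzy step then selects the candidate whose normalized membership across both objectives is largest.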