Deep reinforcement learning towards real-world dynamic thermal management of data centers
Published in: Applied Energy, 2023-03, Vol. 333, p. 120561, Article 120561
Main authors: , , , , ,
Format: Article
Language: English
Online access: Full text
Highlights:
• Algorithms can be sensitive to algorithm settings, affecting optimality and robustness.
• The discrepancy between the digitized and initial objectives cannot be ignored.
• The system dynamics that can affect potential performance improvements are revealed.
• Algorithms can obtain energy savings and violation reductions in some scenarios.
• Actor-critic, off-policy, and model-based algorithms exhibit better performance.
Abstract: Deep reinforcement learning (DRL) has been increasingly researched for dynamic thermal management in data centers. However, existing works typically evaluate algorithm performance on a single specific task, using models or data trajectories, without discussing in detail their implementation feasibility or their ability to handle diverse work scenarios. These gaps limit the real-world deployment of DRL. To this end, this paper comprehensively evaluates the strengths and limitations of state-of-the-art algorithms through analytical and numerical studies. The analysis covers four dimensions: algorithms, tasks, system dynamics, and knowledge transfer. As an inherent property, sensitivity to algorithm settings is first evaluated in a simulated data center model. The ability to handle various tasks and the sensitivity to reward functions are then studied. The trade-off between constraint satisfaction and power savings is identified through ablation experiments. Next, performance under different work scenarios is investigated, including various equipment, workload schedules, locations, and power densities. Finally, the transferability of algorithms across tasks and scenarios is evaluated. The results show that actor-critic, off-policy, and model-based algorithms outperform the others in optimality, robustness, and transferability. They can reduce violations and achieve around 8.84% power savings in some scenarios compared to the default controller. However, deploying these algorithms in real-world systems remains challenging because they are sensitive to specific hyperparameters, reward functions, and work scenarios. Constraint violations and sample efficiency also remain unsatisfactory. This paper presents our well-structured investigations, new findings, and the challenges of deploying deep reinforcement learning in data centers.
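The trade-off between constraint violations and power savings hinges on how the control objective is digitized into a reward, which is also the source of the objective discrepancy noted in the highlights. Below is a minimal sketch of one common formulation, a power term plus a weighted penalty on temperature-limit violations; the function name, temperature limit, and penalty weight are illustrative assumptions, not the paper's actual reward:

```python
import numpy as np

# Assumption: an ASHRAE-style upper limit on rack-inlet temperature (deg C).
T_MAX = 27.0

def thermal_reward(power_kw: float, inlet_temps_c: np.ndarray,
                   lam: float = 10.0) -> float:
    """Negative cost: total power plus a weighted penalty on violations.

    power_kw      -- total cooling + IT power at this control step
    inlet_temps_c -- rack-inlet temperatures from the monitored sensors
    lam           -- penalty weight (hypothetical); larger values favor
                     fewer violations at the cost of smaller power savings
    """
    # Degrees above the limit, summed over all monitored rack inlets.
    violation_c = np.clip(inlet_temps_c - T_MAX, 0.0, None).sum()
    return -(power_kw + lam * violation_c)

# A more aggressive setpoint that saves 10 kW but overheats one rack by
# 1.5 deg C scores worse than the safe setpoint once lam is large enough.
safe = thermal_reward(120.0, np.array([24.5, 25.1, 26.0]))
risky = thermal_reward(110.0, np.array([24.5, 25.1, 28.5]))
print(safe, risky)  # -120.0 -125.0
```

An ablation over reward functions of this shape would sweep lam: small values let the agent trade thermal violations for power savings, while large values enforce the constraint but shrink the achievable savings.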
ISSN: 0306-2619
DOI: 10.1016/j.apenergy.2022.120561