A Remaining Useful Life Prediction Method of Rolling Bearings Based on Deep Reinforcement Learning


Bibliographic Details
Published in: IEEE Internet of Things Journal, 2024-07, Vol. 11 (13), pp. 22938-22949
Main authors: Zheng, Guokang; Li, Yasong; Zhou, Zheng; Yan, Ruqiang
Format: Article
Language: English
Description
Abstract: Remaining useful life (RUL) prediction technology is a crucial task in prognostics and health management (PHM) systems, as it contributes to the enhancement of the reliability of equipment operation. With the development of Industrial Internet of Things (IIoT) technologies, it becomes possible to efficiently coordinate data collection for mechanical equipment, enabling real-time monitoring of device status and performance. This could provide more accurate estimations of the RUL. Although current RUL prediction techniques predominantly rely on deep learning (DL), these approaches often neglect the temporal correlation within training samples, resulting in unstable prediction outcomes. To address this issue, a novel RUL prediction method is introduced, leveraging deep reinforcement learning (DRL). This method combines the effective feature extraction ability of DL with the preservation of temporal correlation between samples through reinforcement learning. First, an autoencoder (AE) is employed to extract the key features most relevant to the degradation process from the original signals collected from mechanical equipment. Second, the state variables in reinforcement learning are constructed from the extracted features and the predicted RUL value of the sample at the previous time step. Finally, a DRL model based on the twin delayed deep deterministic policy gradient (TD3) algorithm is trained after setting an appropriate action space and reward function. Validation on the XJTU-SY bearing data set demonstrates that the DRL method yields a lower root mean square error (RMSE) and more stable prediction results than alternative methods.
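The abstract describes the pipeline only at a high level. The sketch below illustrates how a state/action/reward setup of this kind might be wired together: the state concatenates the autoencoder features with the previous RUL prediction, the continuous action is the RUL estimate for the current step, and the reward penalizes prediction error. The feature dimension, normalized RUL labels, action range, and negative-absolute-error reward are assumptions for illustration, not the paper's exact design; a TD3 agent from any standard DRL library could then be trained against such an environment.

```python
import numpy as np

class RULPredictionEnv:
    """Minimal sketch of an RUL-prediction environment (assumptions, not the
    paper's exact design): state = AE features + previous RUL prediction,
    action = current RUL estimate, reward = negative prediction error."""

    def __init__(self, features, true_rul):
        # features: (T, d) array of autoencoder features, one row per time step
        # true_rul: (T,) array of ground-truth RUL labels, assumed normalized to [0, 1]
        self.features = features
        self.true_rul = true_rul
        self.t = 0
        self.prev_pred = 1.0  # assume full remaining life at the start

    def reset(self):
        self.t = 0
        self.prev_pred = 1.0
        return self._state()

    def _state(self):
        # Appending the previous prediction is what carries temporal
        # correlation between consecutive samples into the state.
        return np.concatenate([self.features[self.t], [self.prev_pred]])

    def step(self, action):
        # Continuous action interpreted as the predicted (normalized) RUL.
        pred = float(np.clip(action, 0.0, 1.0))
        # Assumed reward shape: negative absolute prediction error.
        reward = -abs(pred - self.true_rul[self.t])
        self.prev_pred = pred
        self.t += 1
        done = self.t >= len(self.true_rul)
        next_state = None if done else self._state()
        return next_state, reward, done
```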
ISSN: 2327-4662
DOI: 10.1109/JIOT.2024.3363610