BlockHammer: Improving Flash Reliability by Exploiting Process Variation Aware Proactive Failure Prediction

Bibliographic Details
Published in: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2020-12, Vol. 39 (12), p. 4563-4574
Main authors: Ma, Ruixiang; Wu, Fei; Lu, Zhonghai; Zhong, Wenmin; Wu, Qiulin; Wan, Jiguang; Xie, Changsheng
Format: Article
Language: English
Description
Summary: NAND flash-based storage devices have become very popular in recent years. Unfortunately, flash blocks suffer from limited endurance. To guarantee flash reliability, flash manufacturers prescribe a specified number of program and erase (P/E) cycles that defines the endurance of the flash blocks within the same chip. To extend the service lifetime of a flash-based device, existing works likewise assume that all flash blocks have the same endurance and adopt P/E-based wear-leveling algorithms in the controller, which distribute P/E cycles evenly across flash blocks. However, many studies indicate that flash blocks exhibit a wide endurance difference due to the fabrication process. The endurance of flash blocks is limited by the weakest block. Thus, the traditional P/E-based block retirement mechanism leaves flash blocks underutilized. To fully exploit the endurance of all blocks and improve the reliability of flash devices, we present BlockHammer, a process variation aware proactive failure prediction scheme. BlockHammer takes process variation and block similarity into consideration and consists of a block classifier and a block lifetime predictor. Using machine learning, we first build a block classifier that sorts flash blocks into different classes. Based on the classification results, we then build a block lifetime prediction model for each class. Flash blocks belonging to the same class are assigned the same model. To verify the effectiveness of BlockHammer, we collect block data from a real NAND flash-based testing platform by emulating the true application scenario of NAND flash. Comparing the predicted values with the tested values, the experimental results show that the proposed proactive failure prediction scheme achieves more than 92% accuracy for flash blocks. Therefore, the block failure point can be accurately predicted in advance using BlockHammer, which greatly enhances the reliability of NAND flash.
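The two-stage structure described in the summary (classify blocks first, then apply one lifetime model per class) can be illustrated with a minimal sketch. The summary does not state which features or machine learning models the authors use, so everything here is an assumption made for illustration: the early-life error-rate feature, the 1-D k-means classifier, the per-class mean as the "lifetime model", and all numbers in `TRAIN` are hypothetical and are not taken from the paper.

```python
from statistics import mean

# Hypothetical training data: (early_error_rate, observed_lifetime_in_PE_cycles).
# All values are invented for illustration; they are not from the paper.
TRAIN = [
    (0.10, 9000), (0.12, 8800), (0.11, 9200),   # stronger blocks
    (0.30, 5000), (0.32, 4800), (0.28, 5200),   # weaker blocks
]

def classify(feature, centroids):
    """Stage 1: assign a block to the class with the nearest centroid."""
    return min(range(len(centroids)), key=lambda i: abs(feature - centroids[i]))

def fit_centroids(samples, k=2, iters=10):
    """Assumed classifier: 1-D k-means on the early-life feature (k >= 2)."""
    values = sorted(s[0] for s in samples)
    # Initialise centroids spread evenly across the sorted feature values.
    centroids = [values[i * (len(values) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[classify(v, centroids)].append(v)
        # Keep the old centroid if a class received no samples.
        centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
    return centroids

def fit_lifetime_models(samples, centroids):
    """Stage 2: one model per class; here simply the mean observed lifetime."""
    per_class = {}
    for feature, lifetime in samples:
        per_class.setdefault(classify(feature, centroids), []).append(lifetime)
    return {cls: mean(lifetimes) for cls, lifetimes in per_class.items()}

def predict_lifetime(feature, centroids, models):
    """Blocks in the same class share the same lifetime model."""
    return models[classify(feature, centroids)]

centroids = fit_centroids(TRAIN)
models = fit_lifetime_models(TRAIN, centroids)
```

With these made-up samples, calling `predict_lifetime(0.13, centroids, models)` places a new block in the stronger class and returns that class's mean lifetime. A real predictor would presumably use richer features and a trained regression model per class; the sketch only shows how the classifier output selects which model is applied.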
ISSN: 0278-0070, 1937-4151
DOI: 10.1109/TCAD.2020.2981025