ADAPTIVE ERROR CORRECTION TO IMPROVE FOR SYSTEM MEMORY RELIABILITY, AVAILABILITY, AND SERVICEABILITY (RAS)

Bibliographic details
Main authors: HSING-MIN CHEN, KULJIT BAINS, SREENIVAS MANDAVA, JING LING, WEI P. CHEN, THEODROS YIGZAW, VAIBHAV SINGH, DEEP K. BUCH, ANDREW M. RUDOFF, RAJAT AGARWAL, JOHN G. HOLM, KJERSTEN E. CRISS, WEI WU
Format: Patent
Language: English
Description
Summary: A memory subsystem includes memory devices with space dynamically allocated for improvement of reliability, availability, and serviceability (RAS) in the system. Error checking and correction (ECC) logic detects an error in all or a portion of a memory device. In response to error detection, the system can dynamically perform one or more of: allocate active memory device space for sparing to spare a failed memory segment; write a poison pattern into a failed cacheline to mark it as failed; perform permanent fault detection (PFD) and adjust application of ECC based on PFD detection; or, spare only a portion of a device and leave another portion active, including adjusting ECC based on the spared portion. The error detection can be based on bits of an ECC device, and error correction based on those bits and additional bits stored on the data devices.
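
Of the responses listed in the summary, the poison-pattern option is the simplest to illustrate. The following is a minimal toy sketch of that idea in Python, not the patented mechanism: the class ToyMemory, the exception PoisonedLineError, the 64-byte cacheline size, and the POISON_PATTERN value are all assumptions made for illustration, and real hardware would typically encode poison through the ECC bits rather than by comparing data bytes.

```python
# Illustrative toy model only. ToyMemory, PoisonedLineError, CACHELINE_BYTES,
# and POISON_PATTERN are hypothetical names and values, not from the patent.
CACHELINE_BYTES = 64  # assumed cacheline size
POISON_PATTERN = bytes([0xDE, 0xAD, 0xBE, 0xEF]) * (CACHELINE_BYTES // 4)


class PoisonedLineError(Exception):
    """Raised when a read hits a cacheline that was marked as failed."""


class ToyMemory:
    def __init__(self, num_lines):
        # Every cacheline starts out zero-filled.
        self.lines = [bytes(CACHELINE_BYTES) for _ in range(num_lines)]

    def write(self, index, data):
        if len(data) != CACHELINE_BYTES:
            raise ValueError("data must be exactly one cacheline")
        self.lines[index] = bytes(data)

    def mark_failed(self, index):
        # Assumed to be called after ECC logic reports an error it cannot
        # correct for this line: overwrite the line with the poison pattern.
        self.lines[index] = POISON_PATTERN

    def read(self, index):
        data = self.lines[index]
        # Simplification: ordinary data that happens to equal the pattern
        # would also be treated as poisoned in this toy model.
        if data == POISON_PATTERN:
            raise PoisonedLineError(f"cacheline {index} is poisoned")
        return data
```

In this sketch a consumer learns about the failure only when it reads the marked line, which lets the rest of memory stay in service.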