Proposal of an Adaptive Fault Tolerance Mechanism to Tolerate Intermittent Faults in RAM

Due to transistor shrinking, intermittent faults are a major concern in current digital systems. This work presents an adaptive fault tolerance mechanism based on error correction codes (ECC), able to modify its behavior when the error conditions change without increasing the redundancy. As a case e...

Detailed description

Bibliographic details
Published in: Electronics (Basel) 2020-12, Vol.9 (12), p.2074
Main authors: Baraza-Calvo, J.-Carlos, Gracia-Morán, Joaquín, Saiz-Adalid, Luis-J., Gil-Tomás, Daniel, Gil-Vicente, Pedro-J.
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue 12
container_start_page 2074
container_title Electronics (Basel)
container_volume 9
creator Baraza-Calvo, J.-Carlos
Gracia-Morán, Joaquín
Saiz-Adalid, Luis-J.
Gil-Tomás, Daniel
Gil-Vicente, Pedro-J.
description Due to transistor shrinking, intermittent faults are a major concern in current digital systems. This work presents an adaptive fault tolerance mechanism based on error correction codes (ECC), able to modify its behavior when the error conditions change without increasing the redundancy. As a case example, we have designed a mechanism that can detect intermittent faults and swap from an initial generic ECC to a specific ECC capable of tolerating one intermittent fault. We have inserted the mechanism in the memory system of a 32-bit RISC processor and validated it by using VHDL simulation-based fault injection. We have used two (39, 32) codes: a single error correction–double error detection (SEC–DED) and a code developed by our research group, called EPB3932, capable of correcting single errors and double and triple adjacent errors that include a bit previously tagged as error-prone. The results of injecting transient, intermittent, and combinations of intermittent and transient faults show that the proposed mechanism works properly. As an example, the percentage of failures and latent errors is 0% when injecting a triple adjacent fault after an intermittent stuck-at fault. We have synthesized the adaptive fault tolerance mechanism proposed in two types of FPGAs: non-reconfigurable and partially reconfigurable. In both cases, the overhead introduced is affordable in terms of hardware, time and power consumption.
doi_str_mv 10.3390/electronics9122074
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2468802723</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2468802723</sourcerecordid><originalsourceid>FETCH-LOGICAL-c319t-b6d52065dcec1eb27243cb6a4c3f490f32954fdcecf0e29a464ccbea10c9f89d3</originalsourceid><addsrcrecordid>eNplUE1Lw0AUXETBUvsHPC14ju5X0-yxFKuFFkUqeAubl7eYkuzG3a3gvzclPQi-yzxm5s2DIeSWs3spNXvAFiEF7xqImgvBFuqCTAbQmRZaXP7Zr8ksxgMbRnNZSDYhH6_B9z6alnpLjaPL2vSp-Ua6Nsc20b1vMRgHSHcIn8Y1saPJn-mEdOMShq5JCV0aTyJtHH1b7m7IlTVtxNkZp-R9_bhfPWfbl6fNarnNQHKdsiqv54Ll8xoQOFZiIZSEKjcKpFWaWSn0XNmTahkKbVSuACo0nIG2ha7llNyNuX3wX0eMqTz4Y3DDy1KovCjYECkHlxhdEHyMAW3Zh6Yz4afkrDyVWP4vUf4CMP9o6A</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2468802723</pqid></control><display><type>article</type><title>Proposal of an Adaptive Fault Tolerance Mechanism to Tolerate Intermittent Faults in RAM</title><source>MDPI - Multidisciplinary Digital Publishing Institute</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><creator>Baraza-Calvo, J.-Carlos ; Gracia-Morán, Joaquín ; Saiz-Adalid, Luis-J. ; Gil-Tomás, Daniel ; Gil-Vicente, Pedro-J.</creator><creatorcontrib>Baraza-Calvo, J.-Carlos ; Gracia-Morán, Joaquín ; Saiz-Adalid, Luis-J. ; Gil-Tomás, Daniel ; Gil-Vicente, Pedro-J.</creatorcontrib><description>Due to transistor shrinking, intermittent faults are a major concern in current digital systems. This work presents an adaptive fault tolerance mechanism based on error correction codes (ECC), able to modify its behavior when the error conditions change without increasing the redundancy. As a case example, we have designed a mechanism that can detect intermittent faults and swap from an initial generic ECC to a specific ECC capable of tolerating one intermittent fault. 
We have inserted the mechanism in the memory system of a 32-bit RISC processor and validated it by using VHDL simulation-based fault injection. We have used two (39, 32) codes: a single error correction–double error detection (SEC–DED) and a code developed by our research group, called EPB3932, capable of correcting single errors and double and triple adjacent errors that include a bit previously tagged as error-prone. The results of injecting transient, intermittent, and combinations of intermittent and transient faults show that the proposed mechanism works properly. As an example, the percentage of failures and latent errors is 0% when injecting a triple adjacent fault after an intermittent stuck-at fault. We have synthesized the adaptive fault tolerance mechanism proposed in two types of FPGAs: non-reconfigurable and partially reconfigurable. In both cases, the overhead introduced is affordable in terms of hardware, time and power consumption.</description><identifier>ISSN: 2079-9292</identifier><identifier>EISSN: 2079-9292</identifier><identifier>DOI: 10.3390/electronics9122074</identifier><language>eng</language><publisher>Basel: MDPI AG</publisher><subject>Adaptive systems ; Design ; Digital systems ; Error correction ; Error correction &amp; detection ; Error detection ; Fault detection ; Fault tolerance ; Manufacturing ; Microprocessors ; Power consumption ; Random access memory ; Reconfiguration ; Redundancy ; RISC ; Transistors</subject><ispartof>Electronics (Basel), 2020-12, Vol.9 (12), p.2074</ispartof><rights>2020. This work is licensed under http://creativecommons.org/licenses/by/3.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c319t-b6d52065dcec1eb27243cb6a4c3f490f32954fdcecf0e29a464ccbea10c9f89d3</citedby><cites>FETCH-LOGICAL-c319t-b6d52065dcec1eb27243cb6a4c3f490f32954fdcecf0e29a464ccbea10c9f89d3</cites><orcidid>0000-0001-9225-1998 ; 0000-0001-9715-8960 ; 0000-0002-4868-2050 ; 0000-0001-7692-2309</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,27901,27902</link.rule.ids></links><search><creatorcontrib>Baraza-Calvo, J.-Carlos</creatorcontrib><creatorcontrib>Gracia-Morán, Joaquín</creatorcontrib><creatorcontrib>Saiz-Adalid, Luis-J.</creatorcontrib><creatorcontrib>Gil-Tomás, Daniel</creatorcontrib><creatorcontrib>Gil-Vicente, Pedro-J.</creatorcontrib><title>Proposal of an Adaptive Fault Tolerance Mechanism to Tolerate Intermittent Faults in RAM</title><title>Electronics (Basel)</title><description>Due to transistor shrinking, intermittent faults are a major concern in current digital systems. This work presents an adaptive fault tolerance mechanism based on error correction codes (ECC), able to modify its behavior when the error conditions change without increasing the redundancy. As a case example, we have designed a mechanism that can detect intermittent faults and swap from an initial generic ECC to a specific ECC capable of tolerating one intermittent fault. We have inserted the mechanism in the memory system of a 32-bit RISC processor and validated it by using VHDL simulation-based fault injection. 
We have used two (39, 32) codes: a single error correction–double error detection (SEC–DED) and a code developed by our research group, called EPB3932, capable of correcting single errors and double and triple adjacent errors that include a bit previously tagged as error-prone. The results of injecting transient, intermittent, and combinations of intermittent and transient faults show that the proposed mechanism works properly. As an example, the percentage of failures and latent errors is 0% when injecting a triple adjacent fault after an intermittent stuck-at fault. We have synthesized the adaptive fault tolerance mechanism proposed in two types of FPGAs: non-reconfigurable and partially reconfigurable. In both cases, the overhead introduced is affordable in terms of hardware, time and power consumption.</description><subject>Adaptive systems</subject><subject>Design</subject><subject>Digital systems</subject><subject>Error correction</subject><subject>Error correction &amp; detection</subject><subject>Error detection</subject><subject>Fault detection</subject><subject>Fault tolerance</subject><subject>Manufacturing</subject><subject>Microprocessors</subject><subject>Power consumption</subject><subject>Random access 
memory</subject><subject>Reconfiguration</subject><subject>Redundancy</subject><subject>RISC</subject><subject>Transistors</subject><issn>2079-9292</issn><issn>2079-9292</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>BENPR</sourceid><recordid>eNplUE1Lw0AUXETBUvsHPC14ju5X0-yxFKuFFkUqeAubl7eYkuzG3a3gvzclPQi-yzxm5s2DIeSWs3spNXvAFiEF7xqImgvBFuqCTAbQmRZaXP7Zr8ksxgMbRnNZSDYhH6_B9z6alnpLjaPL2vSp-Ua6Nsc20b1vMRgHSHcIn8Y1saPJn-mEdOMShq5JCV0aTyJtHH1b7m7IlTVtxNkZp-R9_bhfPWfbl6fNarnNQHKdsiqv54Ll8xoQOFZiIZSEKjcKpFWaWSn0XNmTahkKbVSuACo0nIG2ha7llNyNuX3wX0eMqTz4Y3DDy1KovCjYECkHlxhdEHyMAW3Zh6Yz4afkrDyVWP4vUf4CMP9o6A</recordid><startdate>20201201</startdate><enddate>20201201</enddate><creator>Baraza-Calvo, J.-Carlos</creator><creator>Gracia-Morán, Joaquín</creator><creator>Saiz-Adalid, Luis-J.</creator><creator>Gil-Tomás, Daniel</creator><creator>Gil-Vicente, Pedro-J.</creator><general>MDPI AG</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SP</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L7M</scope><scope>P5Z</scope><scope>P62</scope><scope>PHGZM</scope><scope>PHGZT</scope><scope>PIMPY</scope><scope>PKEHL</scope><scope>PQEST</scope><scope>PQGLB</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><orcidid>https://orcid.org/0000-0001-9225-1998</orcidid><orcidid>https://orcid.org/0000-0001-9715-8960</orcidid><orcidid>https://orcid.org/0000-0002-4868-2050</orcidid><orcidid>https://orcid.org/0000-0001-7692-2309</orcidid></search><sort><creationdate>20201201</creationdate><title>Proposal of an Adaptive Fault Tolerance Mechanism to Tolerate Intermittent Faults in RAM</title><author>Baraza-Calvo, J.-Carlos ; Gracia-Morán, Joaquín ; Saiz-Adalid, Luis-J. 
; Gil-Tomás, Daniel ; Gil-Vicente, Pedro-J.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c319t-b6d52065dcec1eb27243cb6a4c3f490f32954fdcecf0e29a464ccbea10c9f89d3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Adaptive systems</topic><topic>Design</topic><topic>Digital systems</topic><topic>Error correction</topic><topic>Error correction &amp; detection</topic><topic>Error detection</topic><topic>Fault detection</topic><topic>Fault tolerance</topic><topic>Manufacturing</topic><topic>Microprocessors</topic><topic>Power consumption</topic><topic>Random access memory</topic><topic>Reconfiguration</topic><topic>Redundancy</topic><topic>RISC</topic><topic>Transistors</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Baraza-Calvo, J.-Carlos</creatorcontrib><creatorcontrib>Gracia-Morán, Joaquín</creatorcontrib><creatorcontrib>Saiz-Adalid, Luis-J.</creatorcontrib><creatorcontrib>Gil-Tomás, Daniel</creatorcontrib><creatorcontrib>Gil-Vicente, Pedro-J.</creatorcontrib><collection>CrossRef</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Advanced Technologies &amp; Aerospace 
Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central (New)</collection><collection>ProQuest One Academic (New)</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Middle East (New)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Applied &amp; Life Sciences</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><jtitle>Electronics (Basel)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Baraza-Calvo, J.-Carlos</au><au>Gracia-Morán, Joaquín</au><au>Saiz-Adalid, Luis-J.</au><au>Gil-Tomás, Daniel</au><au>Gil-Vicente, Pedro-J.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Proposal of an Adaptive Fault Tolerance Mechanism to Tolerate Intermittent Faults in RAM</atitle><jtitle>Electronics (Basel)</jtitle><date>2020-12-01</date><risdate>2020</risdate><volume>9</volume><issue>12</issue><spage>2074</spage><pages>2074-</pages><issn>2079-9292</issn><eissn>2079-9292</eissn><abstract>Due to transistor shrinking, intermittent faults are a major concern in current digital systems. This work presents an adaptive fault tolerance mechanism based on error correction codes (ECC), able to modify its behavior when the error conditions change without increasing the redundancy. As a case example, we have designed a mechanism that can detect intermittent faults and swap from an initial generic ECC to a specific ECC capable of tolerating one intermittent fault. We have inserted the mechanism in the memory system of a 32-bit RISC processor and validated it by using VHDL simulation-based fault injection. 
We have used two (39, 32) codes: a single error correction–double error detection (SEC–DED) and a code developed by our research group, called EPB3932, capable of correcting single errors and double and triple adjacent errors that include a bit previously tagged as error-prone. The results of injecting transient, intermittent, and combinations of intermittent and transient faults show that the proposed mechanism works properly. As an example, the percentage of failures and latent errors is 0% when injecting a triple adjacent fault after an intermittent stuck-at fault. We have synthesized the adaptive fault tolerance mechanism proposed in two types of FPGAs: non-reconfigurable and partially reconfigurable. In both cases, the overhead introduced is affordable in terms of hardware, time and power consumption.</abstract><cop>Basel</cop><pub>MDPI AG</pub><doi>10.3390/electronics9122074</doi><orcidid>https://orcid.org/0000-0001-9225-1998</orcidid><orcidid>https://orcid.org/0000-0001-9715-8960</orcidid><orcidid>https://orcid.org/0000-0002-4868-2050</orcidid><orcidid>https://orcid.org/0000-0001-7692-2309</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 2079-9292
ispartof Electronics (Basel), 2020-12, Vol.9 (12), p.2074
issn 2079-9292
2079-9292
language eng
recordid cdi_proquest_journals_2468802723
source MDPI - Multidisciplinary Digital Publishing Institute; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals
subjects Adaptive systems
Design
Digital systems
Error correction
Error correction & detection
Error detection
Fault detection
Fault tolerance
Manufacturing
Microprocessors
Power consumption
Random access memory
Reconfiguration
Redundancy
RISC
Transistors
title Proposal of an Adaptive Fault Tolerance Mechanism to Tolerate Intermittent Faults in RAM
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-15T18%3A45%3A24IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Proposal%20of%20an%20Adaptive%20Fault%20Tolerance%20Mechanism%20to%20Tolerate%20Intermittent%20Faults%20in%20RAM&rft.jtitle=Electronics%20(Basel)&rft.au=Baraza-Calvo,%20J.-Carlos&rft.date=2020-12-01&rft.volume=9&rft.issue=12&rft.spage=2074&rft.pages=2074-&rft.issn=2079-9292&rft.eissn=2079-9292&rft_id=info:doi/10.3390/electronics9122074&rft_dat=%3Cproquest_cross%3E2468802723%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2468802723&rft_id=info:pmid/&rfr_iscdi=true
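
The description field of this record outlines an adaptive scheme: memory starts out protected by a generic SEC-DED code, an intermittent fault is detected, and the mechanism swaps to EPB3932, which uses the same 39-bit word and therefore adds no redundancy. The switching decision alone might be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: the class name AdaptiveEccController, the method report_corrected_error, and the repeat-count threshold are hypothetical, the record does not say how intermittent faults are actually detected, and no encoding or decoding is performed here.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AdaptiveEccController:
    """Hypothetical controller that selects which (39, 32) code is active."""

    threshold: int = 3                     # assumed: repeats needed to tag a bit
    active_code: str = "SEC-DED"           # initial generic code
    error_prone_bit: Optional[int] = None  # bit tagged as intermittent, if any
    _hits: Counter = field(default_factory=Counter)

    def report_corrected_error(self, bit_position: int) -> str:
        """Record one corrected single-bit error and return the active code."""
        if not 0 <= bit_position < 39:
            raise ValueError("bit position must be in 0..38 for a 39-bit word")
        if self.error_prone_bit is None:
            self._hits[bit_position] += 1
            if self._hits[bit_position] >= self.threshold:
                # Assumed policy: repeated errors at the same bit are treated
                # as an intermittent fault, so the bit is tagged and the
                # controller swaps to the specific code.
                self.error_prone_bit = bit_position
                self.active_code = "EPB3932"
        return self.active_code


# Usage: bit 5 fails three times, so the controller swaps codes.
ctrl = AdaptiveEccController(threshold=3)
for pos in (5, 17, 5, 5):
    mode = ctrl.report_corrected_error(pos)
print(mode, ctrl.error_prone_bit)
```

Once a bit is tagged, the sketch stops counting, mirroring the description's statement that the specific code tolerates one intermittent fault; a real design would also need a policy for a second intermittent bit, which the record does not describe.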