The study of the reliability of the hardware part of the office cluster

Bibliographic details
Published in: Informatika (Minsk, Belarus), 2021-07, Vol. 18 (2), p. 48-57
Main authors: Martinovich, T. S.; Paramonov, N. N.; Rymarchuk, A. G.; Tchij, O. P.
Format: Article
Language: English; Russian
Description
Abstract: The reliability measures of the hardware of an office cluster were studied using the SKIF-GEO-Office RB cluster (hereafter “the cluster”) as an example. The cluster was developed within the scientific and technical program "SKIF-NEDRA" (2015-2018, a program of the Union State of Russia and Belarus). The cluster components are housed in a small rack based on a full-tower "Aerocool Expredator Black" case. The basic architectural principles implemented in the cluster are described, together with its composition and its structural and functional scheme. The methodological support for calculating the reliability of the cluster, which builds on the authors' previous studies, and the cluster's structural reliability scheme are justified. The choice of the main reliability measures for the cluster core and for the set of computing facilities is justified, and formulas for calculating these measures are given. The consequences of failures of the cluster's component parts are analyzed. A mathematical model of reliability (a state graph) of the cluster's set of computing facilities is proposed, from which formulas for the mean time to failure and the mean time to interruption of the cluster are derived. The reliability of the cluster as a whole is estimated by calculating the reliability measures from reference data on component reliability and from operating experience with supercomputers of the SKIF family. The reliability measures of the cluster are calculated.
ISSN: 1816-0301; 2617-6963
DOI: 10.37661/1816-0301-2021-18-2-48-57