Optimal strategies for scheduling checkpoints and preventive maintenance

Bibliographic details
Published in:	IEEE Transactions on Reliability, 1990-04, Vol. 39 (1), pp. 9-18
Main authors:	Coffman, E.G.; Gilbert, E.N.
Format:	Article
Language:	English
Subjects:	Application software; Applied sciences; Checkpointing; Computer applications; Costs; Electronics; Exact sciences and technology; Finishing; Mathematical model; Preventive maintenance; Probability distribution; Processor scheduling; Random variables; Testing, measurement, noise and reliability

Description
Abstract:	At checkpoints during the operation of a computer, the state of the system is saved. Whenever the machine fails, it is repaired and then reset to the state saved at the latest checkpoint. In the present work, save times are known constants and repair times are random variables; failures are the epochs of a given renewal process. In scheduling the checkpoints, the cost of saves must be traded off against the cost of work lost when the computer fails. It is shown how to schedule checkpoints to minimize the mean total time to finish a given job. Similar optimization results are obtained for the tails of the distribution of the finishing time. Two variants of the basic model are considered. In one, the computer receives maintenance during each save; in the other, it does not. Applications to the M/G/1 queuing system are touched on.
ISSN:	0018-9529, 1558-1721
DOI:	10.1109/24.52636
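
The trade-off described in the abstract (time spent on saves versus work lost at a failure) can be illustrated with a small numerical sketch. This is not the authors' method. It assumes a much simpler special case than the paper's general renewal model: failures form a Poisson process with rate `failure_rate`, no failures occur during repair, checkpoints are equally spaced, a save also follows the final segment, and only the mean repair time matters. All function and parameter names are hypothetical. Under these assumptions the mean time to complete one segment of length x (work plus save) is (1/failure_rate + mean_repair) * (exp(failure_rate * x) - 1).

```python
import math


def mean_finish_time(work, n_segments, save_time, failure_rate, mean_repair):
    """Mean time to finish `work` split into n equal checkpointed segments.

    Assumptions (not taken from the paper): Poisson failures, no failures
    during repair, and a save of length `save_time` after every segment,
    including the last one. A failure restarts the current segment.
    """
    # Uninterrupted time needed to commit one segment: work plus its save.
    segment = work / n_segments + save_time
    # Closed-form mean completion time of one restartable segment.
    per_segment = (1.0 / failure_rate + mean_repair) * math.expm1(
        failure_rate * segment
    )
    # Segments are independent because Poisson failures are memoryless.
    return n_segments * per_segment


def best_segment_count(work, save_time, failure_rate, mean_repair,
                       max_segments=1000):
    """Scan 1..max_segments and return the count with the smallest mean."""
    return min(
        range(1, max_segments + 1),
        key=lambda n: mean_finish_time(
            work, n, save_time, failure_rate, mean_repair
        ),
    )


if __name__ == "__main__":
    n = best_segment_count(work=100.0, save_time=1.0,
                           failure_rate=0.01, mean_repair=5.0)
    print(n, mean_finish_time(100.0, n, 1.0, 0.01, 5.0))
```

Too few segments make each failure expensive, while too many segments spend the time on saves; the scan simply picks the count in between that gives the smallest mean under the stated assumptions.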