Rigorous development of an embedded fault-tolerant system based on coordinated atomic actions

Describes our experience using coordinated atomic (CA) actions as a system structuring tool to design and validate a sophisticated and embedded control system for a complex industrial application that has high reliability and safety requirements. Our study is based on an extended production cell mod...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE transactions on computers 2002-02, Vol.51 (2), p.164-179
Hauptverfasser: Xu, J., Randell, B., Romanovsky, A., Stroud, R.J., Zorzo, A.F., Canver, E., von Henke, F.
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Describes our experience using coordinated atomic (CA) actions as a system structuring tool to design and validate a sophisticated and embedded control system for a complex industrial application that has high reliability and safety requirements. Our study is based on an extended production cell model, the specification and simulator for which were defined and developed by FZI (Forschungszentrum Informatik, Germany). This "fault-tolerant production cell" represents a manufacturing process involving redundant mechanical devices (provided in order to enable continued production in the presence of machine faults). The challenge posed by the model specification is to design a control system that maintains specified safety and liveness properties even in the presence of a large number and variety of device and sensor failures. Based on an analysis of such failures, we provide details of: (1) a design for a control program that uses CA actions to deal with both safety-related and fault tolerance concerns and (2) the formal verification of this design based on the use of model checking. We found that CA action structuring facilitated both the design and verification tasks by enabling the various safety problems (involving possible clashes of moving machinery) to be treated independently. Even complex situations involving the concurrent occurrence of any pairs of the many possible mechanical and sensor failures can be handled simply yet appropriately. The formal verification activity was performed in parallel with the design activity, and the interaction between them resulted in a combined exercise in "design for validation"; formal verification was very valuable in identifying some very subtle residual bugs in early versions of our design which would have been difficult to detect otherwise.
ISSN:0018-9340
1557-9956
DOI:10.1109/12.980006