PUMA
Istituto di Scienza e Tecnologie dell'Informazione     
Corsini P., Lopriore L., Strigini L. Fail-safeness in a multiprocessor system : a distributed strategy based on backward error recovery. Progetto finalizzato Informatica, MUMICRO, 1982. Document n. IEI-F82-12, 1982.
 
 
Abstract
(English)
A method for fault handling is presented, designed for multiprocessor systems supporting concurrent processes that cooperate through message exchange. The proposal is described with reference to a specific system, i.e., the MuTEAM prototype developed in Pisa; our requirement was that no erroneous output be generated by the system under a single-fault hypothesis. The fault-handling model adopted is based on backward error recovery: the set of all the application processes is partitioned into disjoint subsets (called families), each of which is an atomic unit of recovery. Recovery points are established on communications among families. A single consistent recovery line is maintained, thereby avoiding the domino effect. The model does not rely on mass storage devices; rather, the recovery information for all the processes is kept in the distributed main memory of the system.
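
The recovery model summarised in the abstract can be illustrated with a minimal Python sketch. It is not the MuTEAM implementation: the names `Family`, `System`, `establish_recovery_line` and `recover` are hypothetical, process state is reduced to a counter, and the sketch assumes that the recovery line is refreshed for all families together at every inter-family communication. That is one simple way to keep a single consistent line; the abstract does not say how the original system achieves it.

```python
import copy


class Family:
    """A disjoint subset of processes, treated as one atomic unit of recovery."""

    def __init__(self, name, processes):
        self.name = name
        # Toy process state: one counter per process (assumption for illustration).
        self.state = {p: 0 for p in processes}
        self.inbox = []


class System:
    def __init__(self, families):
        # Families must partition the processes: reject any overlap.
        seen = set()
        for fam in families:
            overlap = seen & set(fam.state)
            if overlap:
                raise ValueError(f"processes in more than one family: {overlap}")
            seen |= set(fam.state)
        self.families = {fam.name: fam for fam in families}
        self.recovery_line = None
        self.establish_recovery_line()

    def establish_recovery_line(self):
        # Recovery information is held in memory only, never on mass storage.
        # All families are saved together, so there is one consistent line.
        self.recovery_line = copy.deepcopy(self.families)

    def local_step(self, family, process):
        # Computation inside a family does not create a recovery point.
        self.families[family].state[process] += 1

    def send(self, src, dst, message):
        # Assumption: a recovery point is established on every
        # communication between families, after the message is delivered.
        self.families[dst].inbox.append((src, message))
        self.establish_recovery_line()

    def recover(self):
        # Backward error recovery: every family returns to the same line,
        # so no cascade of further rollbacks (domino effect) is needed.
        self.families = copy.deepcopy(self.recovery_line)
```

In this sketch, any work done after the last inter-family message is discarded by `recover()`, while the delivered message and the state saved with it are preserved for every family at once.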



 


For further information, contact: Librarian http://puma.isti.cnr.it
