Space Technology 8

Dependable Multiprocessor

>> The Dependable Multiprocessor Solution

The Dependable Multiprocessor technology to be validated by Space Technology 8 (ST8) combines a state-of-the-art supercomputer architecture built from commercial off-the-shelf (COTS) components with advanced fault-tolerant software. A radiation-hardened controller selects the system’s operating mode: simplex, duplex, or triplex. The fault-tolerant middleware, invoked through calls from the science application being processed, detects faults using voting or algorithm-based fault tolerance. When a fault is detected, the middleware, in conjunction with the application code, takes the appropriate action: rollback to a previously defined checkpoint, application restart, processor elimination, or one of the other available responses.

The result is spacecraft that are highly intelligent, lower in cost, smaller, and more capable.
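
The rollback response mentioned above can be illustrated with a minimal Python sketch. It is not the Dependable Multiprocessor middleware: the class `Checkpointed`, the function `run_with_rollback`, and the `fault_detected` callback are hypothetical names introduced here, and the sketch assumes faults are transient, so that repeating a step from the last checkpoint is likely to succeed.

```python
import copy


class Checkpointed:
    """Holds application state plus one saved copy (hypothetical helper)."""

    def __init__(self, state):
        self.state = state
        self._saved = copy.deepcopy(state)

    def checkpoint(self):
        # Save a private copy of the current state.
        self._saved = copy.deepcopy(self.state)

    def rollback(self):
        # Discard the current state and restore the last checkpoint.
        self.state = copy.deepcopy(self._saved)


def run_with_rollback(task, steps, fault_detected, max_retries=3):
    """Run each step; on a detected fault, roll back and retry that step."""
    for step in steps:
        task.checkpoint()
        for _ in range(max_retries + 1):
            step(task.state)  # steps modify the state in place
            if not fault_detected(task.state):
                break
            task.rollback()
        else:
            # Assumed escalation path: give up on rollback for this step.
            raise RuntimeError("retries exhausted; escalate to application restart")
    return task.state
```

When every retry fails, the sketch raises an exception, standing in for escalation to a stronger response such as application restart.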

A Dependable Multiprocessor consists of two key technology enablers:

  1. A computer architecture that uses COTS processing components and a radiation-hardened controller that supports adaptable configuration, providing multiple levels of fault tolerance.

  2. Fault-tolerant middleware (software) that application programmers can use, in conjunction with their own application codes, to invoke appropriate hardware-based and algorithm-based fault tolerance.

Dependable Multiprocessor technology facilitates the reliable use of COTS components where soft errors are likely, while avoiding the drawbacks of traditional hardware replication techniques.
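
To make "algorithm-based fault tolerance" concrete, here is a minimal Python sketch of a classic textbook instance: checksum-augmented matrix multiplication. The source does not say which algorithm-based techniques the Dependable Multiprocessor middleware provides, so this is only an illustration of the general idea, and all function names are hypothetical. Integer matrices are assumed so that checksums can be compared exactly; floating-point data would need a tolerance.

```python
def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def with_column_checksum(a):
    """Append one extra row holding the sum of each column of a."""
    return a + [[sum(col) for col in zip(*a)]]


def with_row_checksum(b):
    """Append to each row of b one extra entry holding that row's sum."""
    return [row + [sum(row)] for row in b]


def checked_matmul(a, b):
    """Multiply the augmented matrices; the product carries its own checksums."""
    return matmul(with_column_checksum(a), with_row_checksum(b))


def find_fault(cf):
    """Return None if all checksums hold, else (bad_rows, bad_cols)."""
    n_rows, n_cols = len(cf) - 1, len(cf[0]) - 1
    bad_rows = [i for i in range(n_rows)
                if sum(cf[i][:n_cols]) != cf[i][n_cols]]
    bad_cols = [j for j in range(n_cols)
                if sum(cf[i][j] for i in range(n_rows)) != cf[n_rows][j]]
    if not bad_rows and not bad_cols:
        return None
    return bad_rows, bad_cols
```

A single corrupted element of the product breaks exactly one row checksum and one column checksum, so the error is located without running the computation on a second processor.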

Validation experiment concept for the Dependable Multiprocessor.

>> The Dependable Multiprocessor Validation Experiment

The Dependable Multiprocessor validation experiment will demonstrate the technological maturity of a COTS-based computer architecture and its fault-tolerant software. Together, these aspects of the Dependable Multiprocessor will allow space scientists to perform on-board scientific processing with confidence. The test article for this validation experiment consists of:

  • Three COTS, PPC-based single-board computers that can be operated in simplex, duplex, or triplex mode and that can be configured to operate as a parallel-processing cluster.

  • A COTS-based mass memory.

  • Fault-tolerance middleware, which provides an underlying layer of fault-tolerance support services for the Dependable Multiprocessor.

  • A separate, radiation-hardened computer that will serve as the controller for the COTS computer cluster.
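
The difference between the simplex, duplex, and triplex modes can be sketched as a comparison of replica outputs. The Python below is an illustrative assumption about how such voting could work, not the ST8 implementation; the function `vote` and its status strings are hypothetical.

```python
from collections import Counter


def vote(outputs):
    """Compare the outputs that the replicas produced for one computation.

    Returns (value, status):
      simplex (1 output):  the value is accepted without a cross-check
      duplex  (2 outputs): a mismatch is detected but cannot be corrected
      triplex (3 outputs): a single disagreeing replica is outvoted
    """
    if len(outputs) == 1:
        return outputs[0], "unchecked"
    value, count = Counter(outputs).most_common(1)[0]
    if count == len(outputs):
        return value, "agree"
    if count > len(outputs) / 2:
        return value, "corrected"
    # No majority: the caller must pick a recovery action instead.
    return None, "fault_detected"
```

For example, `vote([7, 9])` only reports that a fault occurred, while `vote([7, 9, 7])` still delivers the majority value. This is the trade-off between the modes: more replicas buy more fault tolerance at the cost of throughput.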

The test hardware and software will be validated through a series of tests. These tests will verify that the Dependable Multiprocessor configuration described above, when subjected to both simulated and radiation-induced faults while executing realistically demanding scientific software, achieves the performance levels and characteristics predicted by the Dependable Multiprocessor technology models (i.e., radiation-effects models, fault models, error models, performance models, and fault-tolerance models).

Sensitive COTS components will be tested while exposed to high-energy proton and heavy-ion radiation. The data from these tests will tell the engineers which kinds of faults each component will exhibit and how frequently each kind occurs relative to the others. Using software fault-injection tools, the testbed system will then be exposed to these faults at higher than “real life” rates to observe the behavior and performance of the fault-tolerant software while it runs actual and synthetic application programs.
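
As a rough illustration of software fault injection, the Python sketch below flips single bits in a list of 32-bit words at a caller-chosen rate. The source does not describe the actual fault-injection tools, so the function names, the single-bit-flip fault model, and the per-word probability parameter are assumptions made for this example.

```python
import random


def flip_bit(word, bit, width=32):
    """Return word with one bit inverted, modeling a single-event upset."""
    if not 0 <= bit < width:
        raise ValueError("bit index out of range")
    return (word ^ (1 << bit)) & ((1 << width) - 1)


def inject_faults(memory, upsets_per_word, rng):
    """Flip one random bit in each word with probability upsets_per_word.

    Returns a corrupted copy of memory and a log of (index, bit) injections,
    so a test harness can later check which faults the software caught.
    """
    corrupted, log = list(memory), []
    for i, word in enumerate(memory):
        if rng.random() < upsets_per_word:
            bit = rng.randrange(32)
            corrupted[i] = flip_bit(word, bit)
            log.append((i, bit))
    return corrupted, log
```

Raising `upsets_per_word` well above the rate expected in flight is what makes rare faults appear often enough to observe in a short test run, and passing a seeded `random.Random` instance keeps each run repeatable.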

The statistical performance that the models predict for the testbed system will be compared to the performance observed when the Dependable Multiprocessor’s principal building blocks are tested at the system level under proton and heavy-ion radiation.

This comparison will validate the models. Once validated, the models will be used to predict the performance of the Dependable Multiprocessor when running realistic applications in realistic radiation environments associated with Earth-orbiting and planetary missions.
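
One simple way to compare a model prediction with a test observation is sketched below in Python. The source does not state which statistical method the comparison uses. This example assumes fault counts follow a Poisson distribution and applies a normal approximation, which is reasonable only when the expected count is not small. The function names and the threshold `z_limit` are hypothetical.

```python
import math


def poisson_z_score(predicted_rate, exposure, observed_count):
    """Standardized gap between an observed count and the model's expectation.

    predicted_rate: faults per unit of exposure predicted by the model
    exposure:       amount of exposure in the test, in the same unit
    """
    expected = predicted_rate * exposure
    if expected <= 0:
        raise ValueError("expected count must be positive")
    # For a Poisson count, the standard deviation is sqrt(expected).
    return (observed_count - expected) / math.sqrt(expected)


def consistent(predicted_rate, exposure, observed_count, z_limit=2.0):
    """True if the observation lies within z_limit standard deviations."""
    return abs(poisson_z_score(predicted_rate, exposure, observed_count)) <= z_limit
```

With a predicted rate of 0.5 faults per unit and 200 units of exposure, the model expects 100 faults. An observed count of 110 is one standard deviation away and passes the check, while a count of 130 is three standard deviations away and would call the model into question.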

>> A New Era for On-Board Science Data Processing

With the successful validation of the Dependable Multiprocessor, a door will have been opened for implementing on-board scientific computing using high-performance computers and fault-tolerant software. The specific design of the Dependable Multiprocessor itself can be expanded from the three COTS processors validated by ST8 to a computer having 32 COTS processing nodes using the same hardware architecture, radiation-hardened controller, and the same fault-tolerant software used by the ST8 experiment. Such a computer would provide in-space performance of at least 300 MOPS/W.

Addition of the fault-tolerant FPGA co-processor technology from Honeywell/University of Florida would increase performance to on the order of 3000 MOPS/W for many science processing codes.
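
The two efficiency figures can be turned into throughput for a given power budget with simple arithmetic. In the Python sketch below, the 10 W budget is a hypothetical value chosen only for illustration. The 300 and 3000 MOPS/W figures come from the text above.

```python
def throughput_mops(efficiency_mops_per_watt, power_watts):
    """Throughput in MOPS delivered at a given power budget."""
    return efficiency_mops_per_watt * power_watts


POWER_BUDGET_W = 10  # hypothetical power allocation, not from the source

baseline = throughput_mops(300, POWER_BUDGET_W)    # COTS processing nodes alone
with_fpga = throughput_mops(3000, POWER_BUDGET_W)  # with FPGA co-processors
speedup = with_fpga / baseline                     # ratio is independent of the budget
```

Because both configurations are scaled by the same power budget, the ratio between them is a factor of ten whatever budget is chosen.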

The ST8 Dependable Multiprocessor validation experiment will have shown that the computing performance available on board spacecraft for science data processing, for a given amount of power, can be increased by several orders of magnitude. Ultimately this result will prove even more important than the specific hardware approach validated by the ST8 Dependable Multiprocessor experiment and the specific fault-tolerant software used to mitigate the sensitivity of the COTS components it used.

The Dependable Multiprocessor is being developed by Honeywell Aerospace-Clearwater, located in Clearwater, FL. The Principal Investigator is John Samson.

Webmaster: Diane K. Fisher
JPL Official: Nancy J. Leon
Last updated: 3/4/08
JPL Clearance #: 06-1093
