Start Date

5-2020 12:00 AM

Description

Radiation found in terrestrial and space environments can induce errors in SRAM-based FPGAs. Replication of circuitry can be used to mask and detect these errors to improve reliability or availability. This work advances the understanding and implementation of partial circuit replication in SRAM-based FPGAs. Partial circuit replication is the replication of a subset of the components in a circuit. A reliability model is presented that evaluates the reliability benefit of partial circuit replication. The model suggests that the reliability benefit is inversely related to the portion of the circuit replicated. A partial triple modular redundancy (TMR) case study is also presented that evaluates several different selection algorithms. Random selection was found to be ineffective, whereas maximizing protected routes while minimizing inserted voters provided a high return, reducing failure likelihood by 20% with only 9% coverage. A final study applied duplication with compare (DWC) to an FPGA-based networking system to detect persistent silent network disruptions. With 29% coverage, DWC detected 45% of these failures in neutron radiation testing.
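
To illustrate the difference between masking and detecting an error, the following is a minimal Python sketch and is not taken from the paper. The function names `tmr_vote` and `dwc_mismatch` and the example values are hypothetical. It assumes each replica's output can be modeled as an integer bit vector: triplication with a majority voter masks a single faulty replica, while duplication with compare only reports that the two replicas disagree.

```python
def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority vote over three replica outputs.

    Each output bit takes the value held by at least two replicas,
    so an upset confined to one replica is masked.
    """
    return (a & b) | (a & c) | (b & c)


def dwc_mismatch(a: int, b: int) -> bool:
    """Duplication with compare: report an error when two replicas differ.

    This detects a disagreement but cannot tell which replica is correct.
    """
    return a != b


# Hypothetical example: one replica suffers a single-bit upset.
golden = 0b1011
upset = golden ^ 0b0100  # flip one bit in one replica's output

voted = tmr_vote(golden, upset, golden)   # majority restores the golden value
detected = dwc_mismatch(golden, upset)    # comparison flags the disagreement
```

In a partially replicated circuit only a chosen subset of components would be wrapped this way, which is why the choice of that subset and the placement of voters matter in the case study described above.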

Comments

Due to COVID-19, the Symposium was not able to be held this year. However, papers and posters were still submitted.

Available for download on Saturday, May 01, 2021

May 1st, 12:00 AM

Masking and Detecting Radiation-Induced Errors in SRAM-Based FPGAs Through Partial Circuit Replication