Document Type

Conference Paper

Journal/Book Title/Conference

CODES '14: Proceedings of the 2014 International Conference on Hardware/Software Codesign and System Synthesis

Publisher

Association for Computing Machinery

Publication Date

10-12-2014

Funder

Division of Computer and Network Systems, National Science Foundation, Division of Computing and Communication Foundations

Abstract

Network-On-Chips (NoCs) have become the standard communication platform for future massively parallel systems due to their performance, flexibility and scalability advantages. However, reliability issues brought about by scaling in the sub-20nm era threaten to undermine the benefits offered by NoCs. In this paper, we show that QoS policies exacerbate the reliability profile of an exascale system. To mitigate this imposing challenge, we propose Dynamic Wearout Resilient Routing (DWRR) algorithms in QoS-enabled exascale NoCs. Our proposal includes two novel DWRR algorithms enabled by a critical-path monitor and a broadcast-based routing configuration. Using PARSEC benchmarks, our best algorithm improves QoS and long-term sustainability (Mean Time To Failure) of the system by an average of 16% and 25%, respectively, compared to a state-of-the-art fault-tolerant technique. Copyright 2014 ACM.

Comments

© 2014. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in CODES '14: Proceedings of the 2014 International Conference on Hardware/Software Codesign and System Synthesis, https://doi.org/10.1145/2656075.2656100.
