Tackling QoS-Induced Aging in Exascale Systems through Agile Path Selection

Document Type

Conference Paper

Journal/Book Title/Conference

CODES '14: Proceedings of the 2014 International Conference on Hardware/Software Codesign and System Synthesis


Publisher

Association for Computing Machinery

Publication Date



Funder

Division of Computer and Network Systems, National Science Foundation, Division of Computing and Communication Foundations


Networks-on-Chip (NoCs) have become the standard communication platform for future massively parallel systems due to their performance, flexibility and scalability advantages. However, reliability issues brought about by scaling in the sub-20nm era threaten to undermine the benefits offered by NoCs. In this paper, we show that QoS policies degrade the reliability profile of an exascale system. To mitigate this imposing challenge, we propose Dynamic Wearout Resilient Routing (DWRR) algorithms for QoS-enabled exascale NoCs. Our proposal includes two novel DWRR algorithms enabled by a critical-path monitor and a broadcast-based routing configuration. Using PARSEC benchmarks, our best algorithm improves QoS and long-term sustainability (Mean Time To Failure) of the system by an average of 16% and 25%, respectively, compared to a state-of-the-art fault-tolerant technique. Copyright 2014 ACM.
