Document Type
Article
Journal/Book Title/Conference
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Volume
28
Issue
7
Publisher
Institute of Electrical and Electronics Engineers
Publication Date
7-1-2020
Funder
National Science Foundation
First Page
1557
Last Page
1566
Abstract
The emergence of hardware accelerators has brought about several orders of magnitude improvement in the speed of deep neural network (DNN) inference. Among such DNN accelerators, the Google tensor processing unit (TPU) has emerged as best-in-class, offering more than 15× speedup over contemporary GPUs. However, the rapid growth in DNN workloads is escalating the energy consumption of TPU-based data centers. To restrict the energy consumption of TPUs, we propose GreenTPU, a low-power near-threshold computing (NTC) TPU design paradigm. To ensure high inference accuracy at low-voltage operation, GreenTPU identifies the patterns in the error-causing activation sequences in the systolic array, and prevents further timing errors from similar patterns by intermittently boosting the operating voltage of the specific multiplier-and-accumulator units in the TPU. Compared to a cutting-edge timing error mitigation technique for TPUs, GreenTPU enables 2× to 3× higher performance (TOPS) in an NTC TPU, with a minimal loss in prediction accuracy.
Recommended Citation
Pramesh Pandey, Prabal Basu, Koushik Chakraborty, and Sanghamitra Roy, GreenTPU: Predictive Design Paradigm for Improving Timing Error Resilience of a Near-Threshold Tensor Processing Unit, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Volume 28, Issue 7, pp. 1557-1566, July 2020.
Comments
© 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.