Exploring Warp Criticality in Near-Threshold GPGPU Applications using a Dynamic Choke Point Analysis
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Institute of Electrical and Electronics Engineers
National Science Foundation
General-purpose graphics processing units (GPGPUs), owing to their enormous parallelism, are ubiquitous in parallel computing. However, their peak power rating has also increased over the years. As a consequence, near-threshold computing (NTC) has emerged as a promising remedy. Yet severe device-level delay variability arising from process variation (PV) can significantly diminish NTC system performance. In this article, we examine choke points, a unique device-level characteristic of PV at NTC, which can exacerbate the delays of parallel GPGPU warps. To improve NTC GPU performance, we propose a family of holistic circuit-architectural solutions, referred to as choke-point-aware warp speculator (CPAWS). CPAWS identifies the choke-point-induced critical warps in GPGPU applications and reduces their execution latencies. Compared to a state-of-the-art warp scheduling policy, our best scheme improves the performance and energy efficiency of an NTC GPU by 39% and 31%, respectively.
Sourav Sanyal, Prabal Basu, Aatreyi Bal, Sanghamitra Roy, and Koushik Chakraborty, "Exploring Warp Criticality in Near-Threshold GPGPU Applications using a Dynamic Choke Point Analysis," IEEE Transactions on Very Large Scale Integration (VLSI) Systems (TVLSI), Volume 28, Issue 2, pp. 456-466, February 2020.