Date of Award:

5-2013

Document Type:

Thesis

Degree Name:

Master of Science (MS)

Department:

Electrical and Computer Engineering

Committee Chair(s)

Koushik Chakraborty

Committee

Koushik Chakraborty

Committee

Dan Watson

Committee

Chris Winstead

Abstract

Recent increases in hard fault rates in modern chip multi-processors have led to a variety of approaches to try and save manufacturing yield. Among these are: fine-grain fault tolerance (such as error correction coding, redundant cache lines, and redundant functional units), and large-grain fault tolerance (such as disabling of faulty cores, adding extra cores, and core salvaging techniques). This paper considers the case of core salvaging techniques and the heterogeneous performance introduced when these techniques have some salvaged and some non-faulty cores. It proposes a hypervisor-based hardware thread scheduler, triggered by detection of spin locks and thread imbalance, that mitigates the loss of throughput resulting from this het- erogeneity. Specifically, a new algorithm, called Most ProgressMade algorithm, reduces the number of synchronization locks held on a salvaged core and balances the time each thread in an application spends running on that core. For some benchmarks, the results show as much as a 2.68x increase in performance over a salvaged chip multi-processor without this technique.

Checksum

822515dcda179947c88609493b51e9a0

Share

COinS