Date of Award:

5-1-2014

Document Type:

Dissertation

Degree Name:

Doctor of Philosophy (PhD)

Department:

Mathematics and Statistics

Advisor/Chair:

John R. Stevens

Abstract

The main aim of this dissertation is to meet real needs of practitioners in multiple hypothesis testing. The issue of multiplicity has become a significant concern in most fields of research as computational abilities have increased, allowing for the simultaneous testing of many (thousands or millions of) statistical hypotheses. While many error rates have been defined to address this issue of multiplicity, this work considers only the most natural generalization of the Type I error rate to multiple tests, the family-wise error rate (FWER). Much work has already been done to establish powerful yet general methods which control the FWER under arbitrary dependencies among tests. This work both introduces these methods and expands upon them, as is detailed through its four main chapters. Chapter 1 contains general introductions and preliminaries important to the remainder of the work, particularly a previously published graphical weighted Bonferroni multiplicity adjustment. Chapter 2 then applies the principles introduced in Chapter 1 to achieve a substantial computational improvement to an existing FWER-controlling multiplicity approach (the Focus Level method) for gene set testing in high-throughput microarray and next-generation sequencing studies using Gene Ontology graphs. This improvement to the Focus Level procedure, which we call the Short Focus Level procedure, is achieved by extending the reach of graphical weighted Bonferroni testing to closed testing situations where restricted hypotheses are present. This is accomplished through Theorem 1 of Chapter 2. As a result of the improvement, the full top-down approach to the Focus Level procedure can now be performed, overcoming a significant disadvantage of the otherwise powerful approach to multiple testing. Chapter 3 presents a solution to a multiple testing difficulty within quantitative trait loci (QTL) mapping in natural populations for QTL LD (linkage disequilibrium) mapping models.
Such models apply a two-hypothesis framework to the testing of thousands of genetic markers across the genome in search of QTL underlying a quantitative trait of interest. Inherent to the model is an unidentifiability issue where a parameter of interest is identifiable only under the alternative hypothesis. Through a second application of graphical weighted Bonferroni methods, we show how the multiplicity can be accounted for while simultaneously accounting for the required logical structuring of the testing such that identifiability is preserved. Finally, Chapter 4 details some of the difficulties associated with the distributional assumptions for the test statistics of the two hypotheses of the LD-based QTL mapping framework. A novel bivariate testing strategy is proposed for these test statistics in order to overcome these distributional difficulties while preserving power in the multiplicity correction by reducing the number of tests performed. Chapter 5 concludes the work with a summary of the main contributions and future research goals aimed at continued progress on the multiple testing issues inherent to both genetics and genomics.
