Date of Award

5-2011

Degree Type

Report

Degree Name

Master of Science (MS)

Department

Mathematics and Statistics

Committee Chair(s)

John R. Stevens

Committee

John R. Stevens

Committee

Adele Cutler

Committee

Jürgen Symanzik

Abstract

It is a typical feature of high-dimensional data analysis, for example in a microarray study, that a researcher performs thousands of statistical tests at a time. Inference for each test is based on its p-value: a p-value smaller than the α-level of the test indicates a statistically significant result. As the number of tests increases, the chance of observing some small p-values becomes very high even when all null hypotheses are true, so we may draw wrong conclusions about the hypotheses. This problem arises whenever several hypotheses are tested simultaneously and is known as the multiple testing problem. Adjusting the p-values can redress it. P-value adjustment methods aim to achieve high statistical power (i.e., a low rate of type II errors, or false negatives) for each hypothesis while keeping the overall Family-Wise Error Rate (FWER), the probability of making at least one type I error (false positive), no larger than α, where α is most often set to 0.05. However, in multiple comparison problems for microarray studies, researchers also consider the False Discovery Rate (FDR) or the positive False Discovery Rate (pFDR) instead of the family-wise type I error rate. Methods that control the FDR generally provide higher statistical power than methods that control the family-wise type I error rate, while keeping the type II error rate low. In practice, microarray studies involve dependent test statistics (or p-values) because genes can be highly dependent on each other within a complicated biological structure. However, some p-value adjustment methods assume independent test statistics. We therefore carry out a simulation study comparing several multiple hypothesis testing methods. Our results suggest a suitable method when the test statistics are dependent with a particular covariance structure, while allowing different values of the underlying parameters in the alternative hypotheses.
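
To make the idea of p-value adjustment concrete, the following is a minimal sketch of two standard textbook procedures: the Bonferroni adjustment, which controls the FWER, and the Benjamini-Hochberg step-up adjustment, which controls the FDR under independence. It is not taken from the report and is not necessarily among the methods the report compares. The function names and the helper `simulate_null_pvalues` are hypothetical, and the simulated p-values are independent and uniform, a simplifying assumption that does not reflect the dependent setting the report studies.

```python
import random


def bonferroni(pvals):
    """Bonferroni adjustment: multiply each p-value by the number of tests."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg step-up adjustment of a list of p-values."""
    m = len(pvals)
    # Indices of the p-values, ordered from smallest to largest.
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def simulate_null_pvalues(m, seed=0):
    """Hypothetical helper: m independent p-values with every null true."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(m)]


def count_significant(pvals, alpha=0.05):
    """Number of p-values strictly below alpha."""
    return sum(1 for p in pvals if p < alpha)


if __name__ == "__main__":
    # Assumption: 1000 independent tests, all null hypotheses true.
    null_p = simulate_null_pvalues(1000, seed=1)
    print("raw rejections:       ", count_significant(null_p))
    print("Bonferroni rejections:", count_significant(bonferroni(null_p)))
    print("BH rejections:        ", count_significant(benjamini_hochberg(null_p)))
```

With every null hypothesis true, each raw rejection is a false positive, and roughly 5% of the raw p-values are expected to fall below 0.05 by chance alone. Both adjustments can only increase a p-value, so neither can reject more hypotheses than the unadjusted tests, and the Benjamini-Hochberg adjusted value is never larger than the Bonferroni one. For the four p-values 0.01, 0.04, 0.03, 0.005, the Bonferroni-adjusted values are approximately 0.04, 0.16, 0.12, 0.02 and the Benjamini-Hochberg values are approximately 0.02, 0.04, 0.04, 0.02.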

Comments

This work made publicly available electronically on June 13, 2011
