Date of Award:


Document Type:


Degree Name:

Doctor of Philosophy (PhD)


Mathematics and Statistics

Committee Chair(s)

Christopher Corcoran


Christopher Corcoran


Adele Cutler


Kady Schneiter


John Stevens


Ronald Munger


Resolving the interplay of the genetic components of a complex disease is a challenging endeavor. Over the past several years, genome-wide association studies (GWAS) have emerged as a popular approach at locating common genetic variation within the human genome associated with disease risk. Assessing genetic-phenotype associations upon hundreds of thousands of genetic markers using the GWAS approach, introduces the potentially high number of false positive signals and requires statistical correction for multiple hypothesis testing. Permutation tests are considered the gold standard for multiple testing correction in GWAS, because they simultaneously provide unbiased Type I error control and high power. However, they demand heavy computational effort, especially with large-scale data sets of modern GWAS. In recent years, the computational problem has been circumvented by using approximations to permutation tests, but several studies have posed sampling conditions in which these approximations are suggestive to be biased.

We have developed an optimized parallel algorithm for the permutation testing approach to multiple testing correction in GWAS, whose implementation essentially abates the computational problem. When introduced to GWAS data, our algorithm yields rapid, precise, and powerful multiplicity adjustment, many orders of magnitude faster than existing employed GWAS statistical software.

Although GWAS have identified many potentially important genetic associations which will advance our understanding of human disease, the common variants with modest effects on disease risk discovered through this approach likely account for a small proportion of the heritability in complex disease. On the other hand, interactions between genetic and environmental factors could account for a substantial proportion of the heritability in a complex disease and are overlooked within the GWAS approach.

We have developed an efficient and easily implemented tool for genetic association studies, whose aim is identifying genes involved in a gene-environment interaction. Our approach is amenable to a wide range of association studies and assorted densities in sampled genetic marker panels, and incorporates resampling for multiple testing correction. Within the context of a case-control study design we demonstrate by way of simulation that our proposed method offers greater statistical power to detect gene-environment interaction, when compared to several competing approaches to assess this type of interaction.




This work made publicly available electronically on April 12, 2012.