Date of Award

5-2007

Degree Type

Report

Degree Name

Master of Science (MS)

Department

Mathematics and Statistics

Committee Chair(s)

John R. Stevens

Committee

John R. Stevens

Abstract

Oligonucleotide arrays, of which Affymetrix GeneChip arrays are a widely used example, have many applications. Before researchers can use the information from these arrays, the raw data must be transformed and summarized into a more meaningful and usable form. One of the more popular methods for doing so is RMA (Robust Multi-array Analysis).

A problem with RMA is that its end result (the estimated gene expression levels) comes from a fairly complicated, nonstandard process. In particular, there is no closed-form estimate of the standard error (SE) of the estimated gene expression levels. The current recommendation is to use a naive SE estimate based on a simple ANOVA model. The resulting estimated SE is the same for all arrays, even when there is reason to believe the SEs should differ.

This paper investigates a computationally efficient implementation of bootstrapping as a way to obtain a valid estimate of the standard errors of RMA expression levels. Oligonucleotide arrays contain a large amount of data, and processing that data already carries a significant computational cost. Bootstrapping compounds this cost. Consequently, efforts have been made to reduce the required number of resamples while still obtaining a reasonable estimate of the standard error. The accompanying R function is nevertheless flexible enough to perform as many resamples as required; the tradeoff is that more resamples require more computation time.
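
The tradeoff between the number of resamples and computation time can be illustrated with a minimal Python sketch. It is not the report's R function and it does not implement RMA: the median of hypothetical log2 probe intensities stands in for the RMA summary, and the names `bootstrap_se`, `n_resamples`, and `probes` are assumptions made only for illustration.

```python
import random
import statistics


def bootstrap_se(values, estimator, n_resamples=200, seed=0):
    """Estimate the standard error of `estimator` by nonparametric bootstrap.

    Assumption: observations are resampled with replacement as independent
    units. The report's actual resampling scheme may differ.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    estimates = []
    for _ in range(n_resamples):
        # Draw n observations with replacement from the original sample.
        resample = [values[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    # The bootstrap SE is the standard deviation of the resampled estimates.
    return statistics.stdev(estimates)


# Hypothetical log2 probe intensities for one probe set on one array.
probes = [7.9, 8.1, 8.4, 8.0, 9.6, 8.2, 7.8, 8.3, 8.1, 8.5, 8.0]

# Fewer resamples are cheaper but give a noisier SE estimate.
se_small = bootstrap_se(probes, statistics.median, n_resamples=50)
se_large = bootstrap_se(probes, statistics.median, n_resamples=2000)
print(se_small, se_large)
```

The cost grows linearly with `n_resamples`, since each resample repeats the full summarization step. That is why reducing the number of resamples matters when the summary is as expensive as RMA.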
