Date of Award:
Doctor of Philosophy (PhD)
Mathematics and Statistics
Rose Qingyang Hu
John R. Stevens
D. Richard Cutler
Data for which the number of predictors exponentially exceeds the number of observations is becoming increasingly prevalent in fields such as bioinformatics, medical imaging, computer vision, and social network analysis. One of the leading questions statisticians must answer when confronted with such “big data” is how to reduce a set of exponentially many predictors to a small set of predictors that have a truly causative effect on the response being modelled. This process is often referred to as feature screening. In this work we propose three new methods for feature screening. The first method (TC-SIS) is intended specifically for data in which both the response and the predictors are categorical. The second method (JCIS) screens for interactions between predictors. JCIS is rare among interaction screening methods in that it does not require first finding a set of causative main effects before screening for interaction effects. The final method (GenCorr) is intended for data having a multivariate response. GenCorr is the only method for multivariate screening that can screen for both causative main effects and causative interactions. Each of these methods will be shown to possess both theoretical robustness and empirical agility.
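As a rough illustration of what feature screening means in practice, the sketch below ranks predictors by their absolute marginal Pearson correlation with the response and keeps the top few. This is a generic marginal screening procedure, not TC-SIS, JCIS, or GenCorr, whose statistics are not given in this abstract. The function name `marginal_screen`, the parameter `keep`, and the simulated data are all assumptions made for this example.

```python
import numpy as np

def marginal_screen(X, y, keep):
    """Rank columns of X by |Pearson correlation| with y; return top `keep` indices.

    Generic marginal screening sketch (hypothetical helper, not the thesis' methods).
    """
    Xc = X - X.mean(axis=0)          # center each predictor
    yc = y - y.mean()                # center the response
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    denom[denom == 0] = np.inf       # constant columns receive a score of 0
    scores = np.abs(Xc.T @ yc) / denom
    order = np.argsort(-scores)      # indices sorted by descending score
    return order[:keep], scores

# Simulated data (assumed for illustration): far more predictors than observations,
# with only columns 5 and 42 actually influencing the response.
rng = np.random.default_rng(0)
n, p = 200, 2000
X = rng.standard_normal((n, p))
y = 3.0 * X[:, 5] - 2.0 * X[:, 42] + 0.5 * rng.standard_normal(n)

selected, scores = marginal_screen(X, y, keep=20)
print(sorted(selected.tolist()))
```

Screening of this kind only reduces the candidate set; a marginal statistic like this one can miss predictors that matter solely through interactions, which is the setting the abstract says JCIS and GenCorr are designed to address.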
Reese, Randall D., "Feature Screening of Ultrahigh Dimensional Feature Spaces With Applications in Interaction Screening" (2018). All Graduate Theses and Dissertations. 7231.
Copyright for this work is retained by the student.