Document Type
Article
Journal/Book Title/Conference
BMC Bioinformatics
Volume
18
Issue
212
Publisher
BioMed Central
Publication Date
4-12-2017
Abstract
Background
Although the dimension of the entire genome can be extremely large, only a parsimonious set of influential SNPs is correlated with a particular complex trait and is important for predicting it. Selecting these influential SNPs efficiently and accurately from millions of candidates is in high demand but remains challenging. We propose a backward elimination iterative distance correlation (BE-IDC) procedure to select the smallest subset of SNPs that guarantees sufficient prediction accuracy, while also resolving the unclear-threshold issue of traditional feature screening approaches.
Results
Across six simulations, the adaptive threshold estimated by BE-IDC performed uniformly better than the fixed-threshold methods used in the current literature. We also applied BE-IDC to an Arabidopsis thaliana genome-wide dataset. Out of 216,130 SNPs, BE-IDC selected four influential SNPs and confirmed the same FRIGIDA gene that was reported by two other traditional methods.
Conclusions
BE-IDC provides both the prediction accuracy and the computational speed that genomic selection demands.
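Illustrative sketch
The abstract builds on distance correlation as the screening statistic. As a rough illustration only, the Python sketch below computes the sample distance correlation between one SNP column and a trait, then ranks SNPs by that score. It shows plain marginal screening, not the authors' BE-IDC procedure: the backward elimination step and the adaptive threshold are not reproduced here. The function names (distance_correlation, rank_by_dcor) and the toy genotype data are hypothetical and chosen for this example.

```python
from math import sqrt


def _centered_distances(v):
    """Double-centered matrix of pairwise absolute differences for a 1-D sample."""
    n = len(v)
    d = [[abs(a - b) for b in v] for a in v]
    row_mean = [sum(r) / n for r in d]  # equals the column means (d is symmetric)
    grand_mean = sum(row_mean) / n
    return [
        [d[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation between two equal-length 1-D samples."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)
    # Squared sample distance covariance and distance variances.
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2
    dvar_x = sum(a[i][j] ** 2 for i in range(n) for j in range(n)) / n**2
    dvar_y = sum(b[i][j] ** 2 for i in range(n) for j in range(n)) / n**2
    denom = sqrt(dvar_x * dvar_y)
    if denom == 0:
        return 0.0  # a constant sample carries no dependence information
    return sqrt(max(dcov2, 0.0) / denom)


def rank_by_dcor(snp_columns, trait):
    """Return (column index, score) pairs sorted by decreasing distance correlation."""
    scores = [(k, distance_correlation(col, trait)) for k, col in enumerate(snp_columns)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy data (hypothetical): genotypes coded 0/1/2 and a trait driven by the first SNP.
snp0 = [0, 1, 2, 0, 1, 2, 0, 1, 2, 0]
snp1 = [1, 1, 1, 0, 0, 0, 2, 2, 2, 1]
trait = [0.1, 1.0, 2.1, -0.1, 0.9, 1.9, 0.0, 1.1, 2.0, 0.05]
ranked = rank_by_dcor([snp0, snp1], trait)
```

In this toy example the first SNP ranks highest because the trait tracks its genotype almost linearly. A fixed-threshold method would keep the top-ranked SNPs up to a preset cutoff, whereas the abstract describes estimating that cutoff adaptively.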
Recommended Citation
Meng, Matthew D.; Wang, Gang; and Brough, Aaron R., "An adaptive threshold determination method of feature screening for genomic selection" (2017). Mathematics and Statistics Faculty Publications. Paper 218.
https://digitalcommons.usu.edu/mathsci_facpub/218