Date of Award
Master of Science (MS)
Mathematics and Statistics
Random forests are ensembles of trees that give accurate predictions for regression, classification and clustering problems. The CART tree, the base learner employed by random forests, has been criticized because of bias in the selection of splitting variables. The performance of random forests is suspect due to this criticism. A new implementation of random forests, Cforest, which is claimed to outperform random forests in both predictive power and variable importance measures, was developed based on Ctree, an implementation of conditional inference trees.
We address the underlying mechanisms of random forests and Cforest in this report. A comparison of random forests and Cforest is presented based on simulated data. Our study shows that, except in some extreme situations and with a proper choice of tuning parameter values, random forests provide higher prediction accuracies and more reliable variable importance measures than Cforest.
Xia, Rong, "Comparison of Random Forests and Cforest: Variable Importance Measures and Prediction Accuracies" (2009). All Graduate Plan B and other Reports. 1255.
Copyright for this work is retained by the student. If you have any questions regarding the inclusion of this work in the Digital Commons, please contact the repository by email.