Date of Award
8-2018
Degree Type
Report
Degree Name
Master of Science (MS)
Department
Mathematics and Statistics
Committee Chair(s)
Adele Cutler
Committee
Adele Cutler
Committee
Richard Cutler
Committee
John Stevens
Abstract
The Random Forest method is a useful machine learning tool developed by Leo Breiman. Many implementations exist across different programming languages, the most popular of which are in R, SAS, and Python. In this paper, we conduct a comprehensive comparison of these implementations with regard to accuracy, variable importance measurements, and timing. The comparison was carried out on a variety of real and simulated data sets with different levels of classification difficulty, numbers of predictors, and sample sizes. It shows unexpectedly different results among the three implementations.
Recommended Citation
Soifua, Breckell, "A Comparison of R, SAS, and Python Implementations of Random Forests" (2018). All Graduate Plan B and other Reports, Spring 1920 to Spring 2023. 1268.
https://digitalcommons.usu.edu/gradreports/1268
Copyright for this work is retained by the student.