Date of Award

8-2018

Degree Type

Report

Degree Name

Master of Science (MS)

Department

Mathematics and Statistics

First Advisor

Adele Cutler

Second Advisor

Richard Cutler

Third Advisor

John Stevens

Abstract

The Random Forest method is a useful machine learning tool developed by Leo Breiman. Many implementations exist across different programming languages, the most popular of which are in R, SAS, and Python. In this paper, we conduct a comprehensive comparison of these implementations with regard to accuracy, variable importance measurements, and timing. The comparison was carried out on a variety of real and simulated data sets with different levels of classification difficulty, numbers of predictors, and sample sizes. The comparison shows unexpectedly different results among the three implementations.
