Title
Tuning Random Forests for Interpretability
Class
Article
College
College of Science
Department
Mathematics and Statistics Department
Faculty Mentor
D. Richard Cutler
Presentation Type
Poster Presentation
Abstract
Random Forests are a widely used predictive technique in the modern data analyst’s toolkit. As with other machine learning methods, Random Forests have hyper-parameters that should be tuned both for the best predictive accuracy and for interpretation. Variable importance measures give users valuable insight into which features are most informative for prediction. The subject of my research is the commonly used permutation importance algorithm for Random Forests. The key results of my research are: 1. When predictive features are highly correlated, importance values can be misleading. 2. The best choice of the Random Forests hyper-parameter mtry for importances may be quite different from the best mtry for prediction, especially when features are highly correlated; when correlated features are byproducts of one another, larger values of mtry give superior importance values. 3. The square root of importance values is a better measure than the raw values. 4. A collection of importance, accuracy, and association measures is more helpful than a single tuning measure. I implemented plots and measures associated with these results in a package for the R programming language to assist users of Random Forests. Ultimately, the package helps analysts tune Random Forests based on variable importance information as well as predictive accuracy.
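The ideas in the abstract can be illustrated with a short sketch. The R code below is a minimal example, not the author's package (whose name is not given here): it fits randomForest models at several mtry values on simulated data with highly correlated predictors, extracts permutation importances, and applies the square-root scaling from result 3. The simulated variables and grid of mtry values are illustrative assumptions.

```r
# Minimal sketch (assumed setup, not the author's package): how permutation
# importances from the randomForest package shift with mtry under correlation.
library(randomForest)

set.seed(42)

# Simulated data (illustrative): x2 and x3 are near-copies ("byproducts") of x1.
n  <- 500
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)
x3 <- x1 + rnorm(n, sd = 0.1)
x4 <- rnorm(n)
y  <- x1 + 0.5 * x4 + rnorm(n)
dat <- data.frame(y, x1, x2, x3, x4)

# Fit a forest at each candidate mtry and extract permutation importance
# (type = 1 is the permutation-based mean decrease in accuracy, %IncMSE).
mtry_grid <- 1:4
imp <- sapply(mtry_grid, function(m) {
  fit <- randomForest(y ~ ., data = dat, mtry = m, importance = TRUE)
  importance(fit, type = 1, scale = FALSE)[, 1]
})
colnames(imp) <- paste0("mtry=", mtry_grid)

# Square-root scaling of the (nonnegative) importances, per result 3.
imp_sqrt <- sqrt(pmax(imp, 0))
print(round(imp_sqrt, 3))

# Compare how importance values and rankings change across mtry.
matplot(mtry_grid, t(imp_sqrt), type = "b", pch = 1:4, col = 1:4, lty = 1:4,
        xlab = "mtry", ylab = "sqrt(permutation importance)")
legend("topleft", legend = rownames(imp_sqrt), pch = 1:4, col = 1:4, lty = 1:4)
```

With near-duplicate predictors such as x1, x2, and x3 in this simulation, the importance values at small mtry typically differ from those at large mtry, which is the tuning behavior the abstract describes.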
Location
Logan, UT
Start Date
4-12-2023 12:30 PM
End Date
4-12-2023 1:30 PM