Date of Award:

8-2019

Document Type:

Dissertation

Degree Name:

Doctor of Philosophy (PhD)

Department:

Mathematics and Statistics

Committee Chair(s)

D. Richard Cutler

Committee

D. Richard Cutler

Committee

Chris D. Corcoran

Committee

Adele Cutler

Committee

Zachariah Gompert

Committee

Jürgen Symanzik

Abstract

Machine learning is a buzzword that has inundated popular culture in the last few years. The term refers to computer methods that learn and improve automatically from data instead of being explicitly programmed at every step. How best to create and use these methods is an active area of research. Machine learning models can be difficult to build because they must be tuned. This dissertation explores the tuning characteristics of three popular machine learning models and develops a way to select a set of tuning parameters automatically. These findings were used to create an R software package called EZtune, which automatically tunes three widely used machine learning algorithms: support vector machines, gradient boosting machines, and AdaBoost.

The second portion of this dissertation investigates the use of machine learning methods to find locations along a genome that are associated with a trait. The performance of methods commonly used for these types of studies, and of some that are not, is assessed using simulated data. The effect of the strength of the relationship between the genetic code and the trait is of particular interest. The strength of this relationship was found to be the most important factor in the efficacy of each method.

Checksum

efe6239d87d71a86fae687b56eee5952
