Date of Award


Degree Type

Creative Project

Degree Name

Master of Science (MS)


Economics and Finance

Committee Chair(s)

Carly Fox


Carly Fox


Todd Griffith


Pedram Jahangiry


This thesis evaluates natural language processing (NLP) techniques and machine learning models for identifying linguistic signatures of dementia in transcripts from the DementiaBank Pitt corpus. Framed as a binary classification task, the study combines embedding methods (Doc2Vec, Word2Vec, GloVe, and BERT) with traditional machine learning algorithms such as Random Forest, Multinomial Naïve Bayes, AdaBoost, k-nearest neighbors (KNN), and Logistic Regression, and with deep learning architectures such as LSTM, Bi-LSTM, and CNN-LSTM. Each approach is evaluated on its ability to distinguish transcribed speech from participants with dementia from that of control subjects. To improve interpretability, the study also applies feature importance analysis using LIME, SHAP, permutation importance, and integrated gradients, identifying the features that most strongly drive model predictions. The results indicate the potential of combined NLP and machine learning approaches for medical screening and contribute insights to research on NLP-based dementia screening.
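
To make the binary classification setup concrete, the following is a minimal sketch of one of the simpler model families named above, a Multinomial Naïve Bayes classifier over word counts, written in plain Python. It is not the thesis's implementation: the four transcripts are invented for illustration and are not drawn from the DementiaBank Pitt corpus, the function names are hypothetical, and the log-likelihood-ratio word ranking is only a simple stand-in for the LIME, SHAP, permutation importance, and integrated gradients analyses described in the abstract.

```python
import math
from collections import Counter

# Invented toy transcripts (NOT DementiaBank data); 1 = dementia, 0 = control.
TRAIN = [
    ("the boy is uh taking the the thing from the uh jar", 1),
    ("she is uh washing the the thing and water is uh there", 1),
    ("the boy is stealing cookies from the jar on the shelf", 0),
    ("the mother is drying dishes while the sink overflows", 0),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train_multinomial_nb(examples, alpha=1.0):
    """Fit class priors and Laplace-smoothed word log-likelihoods."""
    class_docs = Counter()
    word_counts = {}
    vocab = set()
    for text, label in examples:
        class_docs[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for tok in tokenize(text):
            counts[tok] += 1
            vocab.add(tok)
    total_docs = sum(class_docs.values())
    model = {"vocab": vocab, "log_prior": {}, "log_lik": {}}
    for label, counts in word_counts.items():
        denom = sum(counts.values()) + alpha * len(vocab)
        model["log_prior"][label] = math.log(class_docs[label] / total_docs)
        model["log_lik"][label] = {
            w: math.log((counts[w] + alpha) / denom) for w in vocab
        }
    return model


def predict(model, text):
    """Return the label with the highest posterior log-score.

    Tokens outside the training vocabulary are ignored.
    """
    scores = {}
    for label, prior in model["log_prior"].items():
        lik = model["log_lik"][label]
        scores[label] = prior + sum(lik[t] for t in tokenize(text) if t in lik)
    return max(scores, key=scores.get)


def top_indicative_words(model, label, other, k=3):
    """Rank words by log-likelihood ratio between two classes."""
    ratios = {
        w: model["log_lik"][label][w] - model["log_lik"][other][w]
        for w in model["vocab"]
    }
    return sorted(ratios, key=ratios.get, reverse=True)[:k]


model = train_multinomial_nb(TRAIN)
print(predict(model, "uh the thing is uh there"))
print(top_indicative_words(model, label=1, other=0, k=2))
```

In this toy setting the filler word "uh" and the vague noun "thing" receive the highest ratios for the dementia class, which illustrates the kind of linguistic signature a feature importance analysis is meant to surface. A real study would use a proper tokenizer, held-out evaluation, and far more data.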