Effect of PCA Centering and Scaling on Classification of Mycobacteria from Raman Spectra
Document Type
Article
Journal/Book Title
Applied Spectroscopy
Publication Date
11-25-2016
Volume
71
Issue
6
Abstract
Raman spectroscopy has been used for decades to detect and identify biological substances because it provides specific molecular information. Spectra collected from biological samples are often complex, requiring the aid of data truncation techniques such as principal component analysis (PCA) and multivariate classification methods. Classification results depend on the proper selection of principal components (PCs) and on how PCA is performed (scaling and/or centering). There are also guidelines for choosing the optimal number of PCs, such as a scree plot, the Kaiser criterion, or cumulative percent variance. The goal of this research is to evaluate these methods for best implementation of PCA and PC selection to classify Raman spectra of bacteria. Raman spectra of three different isolates of mycobacteria (Mycobacterium sp. JLS, Mycobacterium sp. KMS, Mycobacterium sp. MCS) were collected and then passed through PCA and linear discriminant analysis for classification. PCA implementation and PC selection were evaluated by comparing the highest possible classification accuracies against the accuracies obtained with each PC selection method, for each centering and scaling option. Centered and unscaled data provided the best results when selecting PCs based on cumulative percent variance.
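The centering and scaling options and two of the PC selection guidelines named in the abstract can be illustrated with a minimal sketch. This is not the authors' code. The function names (`preprocess`, `n_pcs_kaiser`, `n_pcs_cumulative`), the 0.95 default threshold, and the example eigenvalues are all hypothetical. The Kaiser rule is written here in its "above the average eigenvalue" form, which reduces to the familiar "eigenvalue greater than 1" when the data are scaled to unit variance.

```python
from statistics import mean, stdev


def preprocess(spectra, center=True, scale=False):
    """Apply column-wise centering and/or unit-variance scaling.

    `spectra` is a list of rows (one spectrum per row, one column per
    wavenumber). Hypothetical helper for illustration only.
    """
    out_cols = []
    for col in zip(*spectra):
        mu = mean(col) if center else 0.0   # subtract the column mean if centering
        sd = stdev(col) if scale else 1.0   # divide by the sample std if scaling
        out_cols.append([(v - mu) / sd for v in col])
    return [list(row) for row in zip(*out_cols)]


def n_pcs_kaiser(eigenvalues):
    """Kaiser-style rule: keep PCs whose eigenvalue exceeds the mean eigenvalue."""
    avg = mean(eigenvalues)
    return sum(1 for ev in eigenvalues if ev > avg)


def n_pcs_cumulative(eigenvalues, threshold=0.95):
    """Smallest number of PCs whose cumulative variance fraction reaches `threshold`."""
    total = sum(eigenvalues)
    running = 0.0
    for k, ev in enumerate(sorted(eigenvalues, reverse=True), start=1):
        running += ev
        if running / total >= threshold:
            return k
    return len(eigenvalues)


# Made-up eigenvalues (PC variances), not data from the article.
example_eigenvalues = [5.0, 2.5, 1.5, 0.6, 0.4]
kaiser_count = n_pcs_kaiser(example_eigenvalues)
cumulative_count = n_pcs_cumulative(example_eigenvalues, threshold=0.8)
centered = preprocess([[1.0, 10.0], [3.0, 30.0]], center=True, scale=False)
```

The two rules can return different numbers of PCs for the same eigenvalues, which is why the study compares each selection method against the best achievable classification accuracy.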
First Page
1249
Last Page
1255
Recommended Citation
Hanson C*, Sieverts M**, E Vargis+. Effect of PCA centering and scaling on classification of mycobacteria from Raman spectra. Applied Spectroscopy, 71(6), 1249-1255.