Original Article
ROC-supervised principal component analysis in connection with the diagnosis of diseases
Jason B. Nikas, Walter C. Low
Department of Neurosurgery, Pharmaco-Neuro-Immunology Program, Graduate Program in Neuroscience, Department of Integrative Biology and Physiology, Institute for Translational Neuroscience, Center for Neuroengineering, Medical School, University of Minnesota, Minneapolis, MN, USA.
Received January 12, 2011; Accepted February 1, 2011; Epub February 3, 2011; Published February 15, 2011
Abstract: Principal component analysis (PCA) is a data analysis method that can deal with large volumes of data. Owing to the complexity and volume of the data generated by today’s advanced technologies in genomics, proteomics, and metabolomics, PCA has become predominant in the medical sciences. Despite its popularity, PCA leaves much to be desired in terms of accuracy and may not be suitable for certain medical applications, such as diagnostics, where accuracy is paramount. In this study, we introduced a new PCA method, one that is carefully supervised by receiver operating characteristic (ROC) curve analysis. To assess its ability to render an accurate differential diagnosis, and to compare its performance with that of standard PCA, we studied the striatal metabolomic profiles of R6/2 Huntington disease (HD) transgenic mice and of wild type (WT) mice, using high-field (9.4-Tesla) in vivo proton nuclear magnetic resonance (NMR) spectroscopy. We tested both the standard PCA and our ROC-supervised PCA (in each case using both the covariance and the correlation matrix) 1) with the original R6/2 HD mice and WT mice, 2) with unknown mice, whose status had been determined via genotyping, and 3) for their ability to separate the original R6/2 mice into the two age subgroups (8 and 12 wks old). Only our ROC-supervised PCA (with both the covariance and the correlation matrix) passed all tests with a total accuracy of 100%, thus providing evidence that it may be used for diagnostic purposes. (AJTR1101001).
Keywords: Diagnostic methods; Principal Component Analysis; Receiver Operating Characteristic (ROC) Curve Analysis; Metabolomics; Nuclear Magnetic Resonance Spectroscopy; Huntington disease
Address all correspondence to: Dr. Jason B. Nikas, Department of Neurosurgery, Medical School, University of Minnesota, 4-218 MTRF, 2001 Sixth St. SE, Minneapolis, MN 55455, USA. T: 612-625-2868 / F: 612-626-9201. E-mail: nikas001@umn.edu
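The abstract does not spell out how the ROC analysis supervises the PCA. As a rough illustration only, the sketch below assumes one plausible reading: each variable is first screened by its area under the ROC curve (AUC), and PCA is then run on the retained variables using either the covariance or the correlation matrix. The function names, the AUC threshold of 0.9, and the synthetic data are all assumptions made for this example and are not taken from the study.

```python
import random


def auc(pos, neg):
    """Area under the ROC curve for one variable (Mann-Whitney form)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def select_by_roc(X, y, threshold=0.9):
    """Indices of variables whose AUC, in either direction, meets the threshold.

    The threshold value is a hypothetical choice for this sketch.
    """
    keep = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        a = auc(pos, neg)
        if max(a, 1.0 - a) >= threshold:
            keep.append(j)
    return keep


def first_pc_scores(X, cols, use_correlation=False, iters=500):
    """Scores of each sample on the first principal component of the chosen columns."""
    if not cols:
        raise ValueError("no variables passed the ROC screen")
    n, k = len(X), len(cols)
    data = [[row[j] for j in cols] for row in X]
    means = [sum(col) / n for col in zip(*data)]
    centered = [[v - m for v, m in zip(row, means)] for row in data]
    if use_correlation:
        # Scaling each column to unit variance turns the covariance
        # matrix computed below into the correlation matrix.
        sds = [(sum(v * v for v in col) / (n - 1)) ** 0.5 for col in zip(*centered)]
        centered = [[v / s for v, s in zip(row, sds)] for row in centered]
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(k)]
           for a in range(k)]
    # Power iteration converges to the leading eigenvector of cov.
    v = [1.0] * k
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(k)) for a in range(k)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return [sum(a * b for a, b in zip(row, v)) for row in centered]


# Synthetic data (not the study's): 10 samples per group, 5 variables,
# of which only the first two differ between the groups.
random.seed(0)
X, y = [], []
for label in (0, 1):
    for _ in range(10):
        informative = [random.gauss(8.0 * label, 1.0) for _ in range(2)]
        noise = [random.gauss(0.0, 1.0) for _ in range(3)]
        X.append(informative + noise)
        y.append(label)

cols = select_by_roc(X, y)
scores = first_pc_scores(X, cols)
print("selected variables:", cols)
print("PC1 scores by group:", sorted(zip(y, scores)))
```

Passing `use_correlation=True` to `first_pc_scores` gives the correlation-matrix variant. In practice a numerical library would normally replace the hand-written power iteration; plain Python is used here only to keep the sketch self-contained.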