Direct classification of high-dimensional data in low-dimensional projected feature spaces: comparison of several classification methodologies

J Biomed Inform. 2007 Apr;40(2):131-8. doi: 10.1016/j.jbi.2006.04.001. Epub 2006 Apr 20.

Abstract

Previously, we introduced a distance (similarity)-based mapping for the visualization of high-dimensional patterns and their relative relationships. The mapping preserves exactly the original distances from all points to any two reference patterns in a special two-dimensional coordinate system, the relative distance plane (RDP). We extend the RDP mapping's applicability from visualization to classification. Several of the classifiers use the RDP directly. These include the standard linear discriminant analysis (LDA), nearest neighbor classifiers, and a transvariation probabilities-based classification method that is natural in the RDP. Several reference directions can also be combined to create new coordinate systems in which arbitrary classifiers can be developed. We obtain increased confidence in the classification results by cycling through all possible reference pairs and computing a misclassification-based weighted accuracy. The classification results on several high-dimensional biomedical datasets are compared.
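
The distance-preserving property described above can be illustrated with a minimal sketch. It assumes one natural construction: the first reference pattern is placed at the origin, the second on the horizontal axis at their mutual distance, and each remaining pattern is positioned by triangulation from its two reference distances. The function names `euclidean` and `rdp_map`, the choice of Euclidean distance, and the coordinate convention are assumptions made for illustration; the paper's own definition of the RDP may differ in its details.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences of numbers."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rdp_map(points, ref1, ref2, dist=euclidean):
    """Map high-dimensional points to 2D, keeping distances to two references.

    Assumed convention (hypothetical, not taken from the paper):
    ref1 maps to (0, 0) and ref2 maps to (d12, 0), where d12 = dist(ref1, ref2).
    Each point p maps to (x, y) with y >= 0 such that its 2D distances to the
    two mapped references equal dist(p, ref1) and dist(p, ref2).
    """
    d12 = dist(ref1, ref2)
    if d12 == 0:
        raise ValueError("reference patterns must be distinct")

    coords = []
    for p in points:
        d1 = dist(p, ref1)
        d2 = dist(p, ref2)
        # Triangulation: solve x^2 + y^2 = d1^2 and (x - d12)^2 + y^2 = d2^2.
        x = (d1 ** 2 - d2 ** 2 + d12 ** 2) / (2 * d12)
        # Clamp at zero to absorb floating-point rounding error.
        y = math.sqrt(max(d1 ** 2 - x ** 2, 0.0))
        coords.append((x, y))
    return coords


# Usage: two 5-dimensional patterns mapped relative to two reference patterns.
patterns = [(1.0, 2.0, 3.0, 4.0, 5.0), (0.0, -1.0, 2.5, 0.0, 1.0)]
reference_a = (0.0, 0.0, 0.0, 0.0, 0.0)
reference_b = (1.0, 1.0, 1.0, 1.0, 1.0)
plane_coords = rdp_map(patterns, reference_a, reference_b)
```

The resulting two-dimensional coordinates could then be passed to any standard classifier, such as the LDA or nearest neighbor classifiers the abstract mentions. Repeating the mapping for each pair of reference patterns would give the cycle over reference pairs described above.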

Publication types

  • Comparative Study
  • Evaluation Study

MeSH terms

  • Algorithms*
  • Artificial Intelligence*
  • Computer Graphics*
  • Computer Simulation
  • Models, Biological*
  • Pattern Recognition, Automated / methods*
  • User-Computer Interface*