Automated feature selection for early keratoconus screening optimization

Biomed Phys Eng Express. 2024 Dec 20;11(1). doi: 10.1088/2057-1976/ad9c7e.

Abstract

This paper presents an automated feature selection (FS) method that optimizes the performance of machine learning (ML) models for early keratoconus screening. A total of 448 parameters were analyzed from a dataset of 3162 observations drawn from the swept-source optical coherence tomography imaging system developed by the Chinese Academy of Sciences Institute of Automation (SS-1000 CASIA OCT) and from electronic health records (EHR). The analysis of variance (ANOVA) method was used to identify the most relevant features. Three classifiers were evaluated: K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and Artificial Neural Networks (ANN). When distinguishing between 2 and 4 keratoconus classes, respectively, classification accuracies were 96.79% and 96.68% for KNN, 98.95% and 97.08% for SVM, and 95.64% and 95.62% for ANN. The results show that selecting 50 features can significantly improve model performance while reducing computation time. The automated feature selection method can also help ophthalmologists better understand the links between various ocular characteristics and keratoconus, potentially leading to advances in early diagnosis, risk prediction, and clinical management of this condition.
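As an illustration of the ANOVA-based ranking described in the abstract, the sketch below computes a one-way ANOVA F statistic per feature and keeps the k highest-scoring features. It is a minimal, standard-library-only sketch and not the authors' implementation: the function names `anova_f_score` and `select_top_k`, the toy data, and the choice k = 1 are all assumptions made for illustration (the study reports selecting 50 of 448 parameters). It also assumes at least two classes and more observations than classes.

```python
from statistics import fmean


def anova_f_score(values, labels):
    """One-way ANOVA F statistic for one feature, grouped by class label.

    Assumes at least two classes and more observations than classes.
    """
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)

    n_obs, n_classes = len(values), len(groups)
    grand_mean = fmean(values)

    ss_between = 0.0  # variation of class means around the grand mean
    ss_within = 0.0   # variation of observations around their class mean
    for members in groups.values():
        class_mean = fmean(members)
        ss_between += len(members) * (class_mean - grand_mean) ** 2
        ss_within += sum((v - class_mean) ** 2 for v in members)

    if ss_within == 0:
        # Perfect separation (or a constant feature): avoid dividing by zero.
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (n_classes - 1)) / (ss_within / (n_obs - n_classes))


def select_top_k(rows, labels, k):
    """Return (sorted indices of the k best features, all F scores)."""
    n_features = len(rows[0])
    scores = [
        anova_f_score([row[j] for row in rows], labels)
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k]), scores


# Hypothetical toy data: 6 observations, 3 features, 2 classes.
# Only feature 0 differs between the classes.
rows = [
    [1.0, 5.0, 0.2],
    [1.2, 3.0, 0.1],
    [0.9, 4.0, 0.3],
    [3.0, 4.5, 0.2],
    [3.1, 3.5, 0.1],
    [2.9, 4.0, 0.3],
]
labels = [0, 0, 0, 1, 1, 1]

selected, scores = select_top_k(rows, labels, k=1)
print("selected feature indices:", selected)
print("F scores:", scores)
```

In practice the retained columns would then be passed to a classifier such as KNN, SVM, or an ANN. To avoid an optimistic accuracy estimate, the ranking should be computed on the training split only and then applied unchanged to the held-out data.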

Keywords: classification; feature selection; machine learning; optimization; overfitting.

MeSH terms

  • Adult
  • Algorithms
  • Automation
  • Cornea / diagnostic imaging
  • Cornea / pathology
  • Electronic Health Records
  • Female
  • Humans
  • Image Processing, Computer-Assisted / methods
  • Keratoconus* / diagnosis
  • Keratoconus* / diagnostic imaging
  • Machine Learning
  • Male
  • Neural Networks, Computer*
  • Support Vector Machine*
  • Tomography, Optical Coherence* / methods