This paper presents an automated feature selection (FS) method for optimizing the performance of machine learning (ML) models used in early keratoconus screening. A total of 448 parameters were analyzed from a dataset of 3162 observations drawn from the swept-source optical coherence tomography imaging system developed by the Chinese Academy of Sciences Institute of Automation (SS-1000 CASIA OCT) and from electronic health records (EHR). The most relevant features were identified using analysis of variance (ANOVA). Three classifiers were evaluated: K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and Artificial Neural Network (ANN). For distinguishing between 2 and 4 keratoconus classes, respectively, the classification accuracies were 96.79% and 96.68% for KNN, 98.95% and 97.08% for SVM, and 95.64% and 95.62% for ANN. The results show that selecting 50 features can significantly improve model performance while reducing computation time. The automated feature selection method can also help ophthalmologists better understand the links between ocular characteristics and keratoconus, potentially leading to advances in early diagnosis, risk prediction, and clinical management of this condition.
Keywords: classification; feature selection; machine learning; optimization; overfitting.
© 2024 IOP Publishing Ltd. All rights, including for text and data mining, AI training, and similar technologies, are reserved.
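
The ANOVA-based ranking described in the abstract can be illustrated with a minimal sketch. It assumes that features are scored with the one-way ANOVA F statistic and that the top-ranked ones are kept; the abstract does not state the exact implementation, so the function names (`anova_f_score`, `select_top_k`, `make_toy_data`) and the synthetic data are hypothetical and are not the study's code or dataset. In the study's setting, the inputs would be the 448 parameters and the value of k would be 50.

```python
import random
from typing import List, Sequence


def anova_f_score(values: Sequence[float], labels: Sequence[int]) -> float:
    """One-way ANOVA F statistic of a single feature across class labels."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n, k = len(values), len(groups)
    if k < 2 or n <= k:
        raise ValueError("need at least 2 classes and more samples than classes")
    grand_mean = sum(values) / n
    means = {label: sum(g) / len(g) for label, g in groups.items()}
    # Variation of the class means around the grand mean.
    ss_between = sum(len(g) * (means[label] - grand_mean) ** 2
                     for label, g in groups.items())
    # Variation of the samples around their own class mean.
    ss_within = sum((v - means[label]) ** 2
                    for label, g in groups.items() for v in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    if ms_within == 0:
        return float("inf") if ms_between > 0 else 0.0
    return ms_between / ms_within


def select_top_k(X: List[List[float]], y: Sequence[int], k: int) -> List[int]:
    """Return the sorted column indices of the k features with the highest F."""
    n_features = len(X[0])
    scores = [anova_f_score([row[j] for row in X], y) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def make_toy_data(n_samples=200, n_features=20, informative=(0, 1, 2), seed=0):
    """Synthetic two-class data (an assumption for illustration only)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 3.0 * label  # shift informative features for class 1
        X.append(row)
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    print(select_top_k(X, y, k=3))
```

The selected column indices would then be used to subset the data before training a classifier such as KNN, SVM, or ANN. Computing the scores on the training split only avoids leaking information from the test data into the selection step.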