Label noise is a common problem that degrades the performance of artificial intelligence models. This study assessed the effectiveness and potential risks of automated label cleaning with an open-source framework, Cleanlab, on multi-category datasets of fundus photography and optical coherence tomography into which label noise ranging from 0 to 70% was intentionally introduced. After six cycles of automatic cleaning, label accuracy improved significantly by 3.4-62.9% and dataset quality scores (DQS) by 5.1-74.4%. The majority (86.6-97.5%) of label errors were correctly fixed, while few were missed (0.5-2.8%) or misclassified (0.4-10.6%). The classification accuracy of RETFound improved significantly, by 0.3-52.9%, when it was trained on the cleaned datasets. We also developed a DQS-guided cleaning strategy to mitigate over-cleaning. Furthermore, in external validation on the EyePACS and APTOS-2019 datasets, cleaning increased label accuracy by 1.3% and 1.8%, respectively. This approach automates label correction, enhances dataset reliability, and improves model performance efficiently and safely.
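The evaluation protocol summarized above (inject label noise at a known rate, clean, then compare against the original labels) can be illustrated with a minimal sketch. It uses only the Python standard library and does not call Cleanlab. The function names, the uniform flipping scheme, and the definitions of "corrected", "missed", and "misclassified" as shares of the injected errors are assumptions made for illustration; the study's exact definitions may differ.

```python
import random


def inject_label_noise(labels, num_classes, noise_rate, seed=0):
    """Flip a fraction `noise_rate` of labels to a different class.

    Assumption: the wrong class is drawn uniformly at random.
    """
    rng = random.Random(seed)
    noisy = list(labels)
    n_flip = round(noise_rate * len(labels))
    for i in rng.sample(range(len(labels)), n_flip):
        wrong_classes = [c for c in range(num_classes) if c != labels[i]]
        noisy[i] = rng.choice(wrong_classes)
    return noisy


def label_accuracy(labels, true_labels):
    """Share of labels that agree with the reference labels."""
    matches = sum(a == b for a, b in zip(labels, true_labels))
    return matches / len(true_labels)


def cleaning_outcomes(true_labels, noisy_labels, cleaned_labels):
    """Split the injected label errors into three hypothetical outcomes.

    corrected:     the cleaned label equals the true label
    missed:        the cleaned label is still the noisy label
    misclassified: the label was changed, but to another wrong class
    """
    errors = [
        i for i, (t, n) in enumerate(zip(true_labels, noisy_labels)) if t != n
    ]
    if not errors:
        return {"corrected": 0.0, "missed": 0.0, "misclassified": 0.0}
    corrected = sum(cleaned_labels[i] == true_labels[i] for i in errors)
    missed = sum(cleaned_labels[i] == noisy_labels[i] for i in errors)
    misclassified = len(errors) - corrected - missed
    total = len(errors)
    return {
        "corrected": corrected / total,
        "missed": missed / total,
        "misclassified": misclassified / total,
    }
```

For example, with `true_labels = [0, 1, 2, 0]`, `noisy_labels = [1, 1, 0, 2]`, and `cleaned_labels = [0, 1, 0, 1]`, three labels are noisy; one is corrected, one is missed, and one is changed to another wrong class, so each outcome accounts for one third of the errors.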
© 2025. The Author(s).