Purpose: This study evaluates a large language model, Generative Pre-trained Transformer 4 with vision, for diagnosing vitreoretinal diseases in real-world ophthalmology settings.
Methods: A retrospective cross-sectional study at Bascom Palmer Eye Clinic, analyzing patient data from January 2010 to March 2023, assesses Generative Pre-trained Transformer 4 with vision's performance on retinal image analysis and International Classification of Diseases 10th revision coding across 2 patient groups: simpler cases (Group A) and complex cases (Group B) requiring more in-depth analysis. Diagnostic accuracy was assessed through open-ended questions and multiple-choice questions independently verified by three retina specialists.
Results: In 256 eyes from 143 patients, Generative Pre-trained Transformer 4-V demonstrated a 13.7% accuracy for open-ended questions and 31.3% for multiple-choice questions, with International Classification of Diseases 10th revision code accuracies at 5.5% and 31.3%, respectively. Accurately diagnosed posterior vitreous detachment, nonexudative age-related macular degeneration, and retinal detachment. International Classification of Diseases 10th revision coding was most accurate for nonexudative age-related macular degeneration, central retinal vein occlusion, and macular hole in OEQs, and for posterior vitreous detachment, nonexudative age-related macular degeneration, and retinal detachment in multiple-choice questions. No significant difference in diagnostic or coding accuracy was found in Groups A and B.
Conclusion: Generative Pre-trained Transformer 4 with vision has potential in clinical care and record keeping, particularly with standardized questions. Its effectiveness in open-ended scenarios is limited, indicating a significant limitation in providing complex medical advice.