The use of artificial intelligence based chat bots in ophthalmology triage

Eye (Lond). 2024 Nov 26. doi: 10.1038/s41433-024-03488-1. Online ahead of print.

Abstract

Purpose: To evaluate the ability of AI-based chatbots to accurately answer common patient questions in the field of ophthalmology.

Methods: An experienced ophthalmologist curated a set of 20 representative questions, and responses were sought from two AI generative models: OpenAI's ChatGPT and Google's Bard (Gemini Pro). Eight expert ophthalmologists from different sub-specialties, blinded to the source, assessed each response and rated it on three metrics (accuracy, comprehensiveness, and clarity) using a 1-5 scale.

Results: For accuracy, ChatGPT scored a median of 4.0, whereas Bard scored a median of 3.0. In terms of comprehensiveness, ChatGPT achieved a median score of 4.5, compared to Bard which scored a median of 3.0. Regarding clarity, ChatGPT maintained a higher score with a median of 5.0, compared to Bard's median score of 4.0. All comparisons were statistically significant (p < 0.001).

Conclusion: AI-based chatbots can provide relatively accurate and clear responses to common ophthalmological inquiries. ChatGPT surpassed Bard on all measured metrics. While these AI models show promise, further research is needed to improve their performance and enable their use as reliable medical tools.