PURPOSE: To evaluate and compare the performance of four leading artificial intelligence (AI) models (ChatGPT 4o, ChatGPT o1, Claude 3.5 Sonnet, and Gemini 2.0 Flash Experimental) in answering ophthalmology questions from two popular board preparation question resources, and to analyze performance variations across subspecialties and resources.
METHODS: From the 398 available questions in the ebodtraining.com question bank, 344 text-based questions were selected and organized to include 35 questions per subspecialty. The same number of questions per subspecialty was randomly selected from eyedocs.co.uk to match those from ebodtraining.com. ChatGPT 4o, ChatGPT o1, Claude 3.5 Sonnet, and Gemini 2.0 Flash Experimental were tested on these questions, and each response was graded as either correct or incorrect, allowing calculation of both overall and subspecialty-specific accuracy.
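As an illustration only, the overall and subspecialty-specific accuracy figures could be computed from the graded responses along the following lines; the record layout and field names here are assumptions for the sketch, not the study's actual scoring pipeline.

```python
from collections import defaultdict

# Hypothetical graded records: (model, subspecialty, is_correct).
# The study's real data format is not specified in the abstract.
responses = [
    ("ChatGPT o1", "Neuro-ophthalmology", True),
    ("ChatGPT o1", "Glaucoma", False),
    ("Claude 3.5 Sonnet", "Glaucoma", True),
    # ... one record per question answered by each model
]

def accuracy(records):
    """Proportion of correct answers among the given records."""
    return sum(correct for _, _, correct in records) / len(records)

# Overall accuracy per model.
by_model = defaultdict(list)
for model, subspecialty, correct in responses:
    by_model[model].append((model, subspecialty, correct))
for model, records in by_model.items():
    print(f"{model}: overall accuracy {accuracy(records):.1%}")

# Subspecialty-specific accuracy per model.
by_model_sub = defaultdict(list)
for model, subspecialty, correct in responses:
    by_model_sub[(model, subspecialty)].append((model, subspecialty, correct))
for (model, subspecialty), records in by_model_sub.items():
    print(f"{model} / {subspecialty}: {accuracy(records):.1%}")
```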
RESULTS: The four AI models were evaluated on two ophthalmology question banks: ebodtraining.com (344 questions) and eyedocs.co.uk (345 questions). On ebodtraining.com, ChatGPT o1 achieved 88.0% accuracy, followed by Claude 3.5 Sonnet (84.7%), Gemini (81.7%), and ChatGPT 4o (81.2%), with all models showing weaker performance in the Neuro-ophthalmology section. Similarly, on eyedocs.co.uk, ChatGPT o1 led with 88.4%, while Claude 3.5 Sonnet reached 84.6%, Gemini 79.2%, and ChatGPT 4o 73.4%. ChatGPT o1 significantly outperformed ChatGPT 4o on both platforms and demonstrated higher accuracy than Claude 3.5 Sonnet and Gemini across multiple subspecialties.
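The abstract does not name the statistical test behind "significantly outperformed"; one common choice when two models answer the same question set is McNemar's test on the paired correct/incorrect outcomes. The sketch below is a hypothetical illustration of that approach, with made-up data, and is not a description of the study's actual analysis.

```python
from statsmodels.stats.contingency_tables import mcnemar

# Hypothetical per-question outcomes (True = correct) for two models on the
# same questions; illustrative values only.
o1_correct    = [True, True, False, True, True, False, True, True]
gpt4o_correct = [True, False, False, True, False, False, True, True]

# Build the 2x2 paired-outcome table:
# rows = o1 correct/incorrect, columns = 4o correct/incorrect.
both       = sum(a and b for a, b in zip(o1_correct, gpt4o_correct))
o1_only    = sum(a and not b for a, b in zip(o1_correct, gpt4o_correct))
gpt4o_only = sum(b and not a for a, b in zip(o1_correct, gpt4o_correct))
neither    = sum(not a and not b for a, b in zip(o1_correct, gpt4o_correct))

table = [[both, o1_only], [gpt4o_only, neither]]
result = mcnemar(table, exact=True)  # exact binomial test on discordant pairs
print(f"McNemar p-value: {result.pvalue:.3f}")
```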
CONCLUSION: In the modern world, time is increasingly precious, and AI models allow students to obtain information and explanations rapidly. In addition, by asking follow-up questions, students can receive personalised answers, save time, and gain a tailored learning experience. However, it should be kept in mind that although AI models demonstrate promising capabilities in ophthalmology board examination preparation, their performance varies significantly across subspecialties and question types. These tools can serve as valuable supplementary resources for exam preparation, but they cannot replace comprehensive clinical training and expertise.