ISSN 2757-8135 | E-ISSN 2757-9816
Eur Eye Res. 2026; 6(1): 60-69 | DOI: 10.14744/eer.2025.93723

Use of large language models in Turkish information materials for glaucoma patient education: evaluation of readability, accuracy and comprehensiveness

Ali Dal1, Murat Erdag2, Betül Dikme1, Bünyamin Kutluksaman1
1Department of Ophthalmology, Tayfur Ata Sokmen Faculty of Medicine, Mustafa Kemal University, Hatay, Turkiye
2Department of Ophthalmology, Fırat Faculty of Medicine, Fırat University, Elazığ, Turkiye

PURPOSE: This study aims to evaluate the readability of the Turkish Ophthalmology Association’s (TOA) glaucoma patient education brochure and to assess the capabilities of GPT-4.0, Gemini, and DeepSeek in generating Turkish patient education materials with respect to readability, accuracy, and comprehensiveness.
METHODS: The TOA's patient education brochure on glaucoma was evaluated for readability using the Ateşman and Bezirci-Yilmaz formulas. The questions from the TOA brochure were presented independently to the GPT-4.0, Gemini, and DeepSeek models, and the responses generated by these models were tested for readability using the same formulas. In addition, qualified ophthalmologists evaluated the accuracy and comprehensiveness of the artificial intelligence (AI)-generated responses. AI-generated responses were then converted to Q1 and Q2 formats to test text simplification. These versions were reevaluated for readability, accuracy, and comprehensiveness to determine whether simplification increased intelligibility without affecting medical accuracy.
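The Ateşman formula applied above adapts the Flesch Reading Ease index to Turkish. As a minimal sketch (not the authors' analysis code), the published formula is 198.825 − 40.175 × (syllables/words) − 2.610 × (words/sentences); the script below assumes simple punctuation-based sentence splitting and uses the fact that every Turkish syllable contains exactly one vowel:

```python
import re

# In Turkish, each syllable contains exactly one vowel,
# so counting vowels counts syllables.
TURKISH_VOWELS = set("aeıioöuüAEIİOÖUÜ")

def count_syllables(word: str) -> int:
    return sum(1 for ch in word if ch in TURKISH_VOWELS)

def atesman_score(text: str) -> float:
    """Ateşman (1997) readability score for Turkish text.

    Higher scores mean easier text (scale roughly 0-100).
    Sentence splitting here is a naive regex; real analyses
    would use a proper Turkish tokenizer.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"\w+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (198.825
            - 40.175 * (syllables / len(words))
            - 2.610 * (len(words) / len(sentences)))
```

Running `atesman_score` on a short sentence such as "Bu bir test." (3 words, 3 syllables, 1 sentence) yields 198.825 − 40.175 − 7.830 ≈ 150.82, i.e. far above the 0–100 band because the toy input is unrealistically short.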
RESULTS: The TOA brochure had a higher readability level than the recommended patient education standard. Bezirci-Yilmaz scores showed that Gemini and DeepSeek had significantly lower readability than the TOA brochure (p=0.007 and p=0.033, respectively), whereas GPT-4.0 showed no significant difference (p=0.077). Ateşman scores indicated no significant difference between TOA and AI-generated texts. Gemini showed significantly higher comprehensiveness than GPT-4.0 (p=0.042), whereas accuracy scores did not differ significantly among the models. Readability improved for Gemini following simplification on both formulas (p=0.013 and p=0.005), whereas GPT-4.0 and DeepSeek remained unchanged. After simplification, the comprehensiveness score decreased for Gemini, whereas GPT-4.0 and DeepSeek maintained their comprehensiveness.
CONCLUSION: While large language models hold promise for use as glaucoma patient information materials, it is essential to rigorously evaluate the accuracy and comprehensiveness of the content they produce.

Keywords: Glaucoma, large language models, readability.


Corresponding Author: Ali Dal, Türkiye
Manuscript Language: English