P486 - LARGE LANGUAGE MODELS FOR DIETARY ADVICE TO PATIENTS WITH IRRITABLE BOWEL SYNDROME: CHATGPT VS. GOOGLE GEMINI

M. Kip¹,*, G. Saban Bozan¹, G. Aytekin Şahin¹

¹Nutrition and Dietetics, Nuh Naci Yazgan University, Kayseri, Türkiye

Rationale: Large language models (LLMs) such as ChatGPT and Gemini have potential in nutrition applications, but recent studies suggest that they can give inaccurate dietary advice. This study aimed to evaluate the dietary recommendations that the most commonly used LLMs, ChatGPT and Gemini, provide for patients with irritable bowel syndrome (IBS).

Methods: The responses of the LLMs were assessed with several tools. A Guideline Compliance Score was created from IBS guidelines. Response quality was assessed using the Global Quality Score (GQS) and the CLEAR tool. Understandability and actionability were assessed using the Patient Education Materials Assessment Tool (PEMAT). Readability of the ChatGPT and Gemini responses was evaluated using the Flesch Reading Ease (FRE) and Flesch-Kincaid Grade Level (FKGL).
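For readers unfamiliar with the two readability indices, the sketch below computes them from the standard published formulas. It is only an illustration: the abstract does not say which software the authors used, the function names `count_syllables` and `readability` are hypothetical, and the syllable counter is a rough vowel-group heuristic, so scores may differ slightly from those of dedicated readability tools.

```python
import re


def count_syllables(word):
    """Rough heuristic (an assumption): count vowel groups, drop a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability(text):
    """Return (FRE, FKGL) for an English text using the standard formulas."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    # Flesch Reading Ease: higher values mean easier text.
    fre = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    # Flesch-Kincaid Grade Level: higher values mean harder text.
    fkgl = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return fre, fkgl


if __name__ == "__main__":
    fre, fkgl = readability("Eat small meals. Drink more water.")
    print(f"FRE={fre:.2f}, FKGL={fkgl:.2f}")
```

Both indices are built from the same two ratios (words per sentence and syllables per word) with opposite signs, so a text that scores higher on FRE generally scores lower on FKGL.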

Results: Most responses from ChatGPT (70%) and Gemini (57.5%) were compliant with the guidelines, and there was no significant difference between the two models in guideline compliance, quality, understandability, actionability, or readability scores (p>0.05). The CLEAR tool showed a moderate positive correlation with PEMAT actionability (r=0.467, p=0.038) and understandability (r=0.568, p=0.009), and a strong positive correlation with GQS (r=0.611, p=0.004). In addition, FRE and FKGL had a strong negative correlation (r=-0.784, p<0.001), while the Guideline Compliance Score showed a moderate negative correlation with FRE (r=-0.537, p=0.015).
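As an illustration of how a correlation coefficient of the kind reported above can be obtained, the sketch below computes Pearson's r and its t statistic for two lists of scores. This is an assumption for illustration only: the abstract does not state which correlation coefficient, software, or sample size was used, and the function names and example scores are hypothetical, not study data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two lists of equal length with at least 3 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    if abs(r) >= 1:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1 - r * r))


if __name__ == "__main__":
    # Made-up scores for illustration only, not data from the study.
    quality = [1, 2, 3, 4]
    actionability = [1, 3, 2, 4]
    r = pearson_r(quality, actionability)
    print(f"r={r:.3f}, t={t_statistic(r, len(quality)):.3f}")
```

The p-value follows from comparing the t statistic with a t distribution on n - 2 degrees of freedom, which statistical packages do directly.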

Conclusion: LLMs need further improvement before they can be relied on alone in clinical nutrition practice. The findings underline the importance of dietitians' recommendations and of collaboration between AI models and healthcare teams.

Disclosure of Interest: None declared