How Well Do Artificial Intelligence Chatbots Respond to the Top Search Queries About Urological Malignancies?

Publication date: 2023 Aug 09

Authors: David Musheyev, Alexander Pan, Stacy Loeb, Abdo E Kabarriti

Source: EUROPEAN UROLOGY

Abstract:

Artificial intelligence (AI) chatbots are becoming a popular source of information, but there are limited data on the quality of the information they provide on urological malignancies. Our objective was to characterize the quality of information and detect misinformation about prostate, bladder, kidney, and testicular cancers from four AI chatbots: ChatGPT, Perplexity, Chat Sonic, and Microsoft Bing AI. We took the top five search queries related to prostate, bladder, kidney, and testicular cancers according to Google Trends from January 2021 to January 2023 and input them into the AI chatbots. Responses were evaluated for quality, understandability, actionability, misinformation, and readability using published instruments. AI chatbot responses had moderate to high information quality (median DISCERN score 4 out of 5, range 2-5) and lacked misinformation. Understandability was moderate (median Patient Education Material Assessment Tool for Printable Materials [PEMAT-P] understandability 66.7%, range 44.4-90.9%) and actionability was moderate to poor (median PEMAT-P actionability 40%, range 0-40%). The responses were written at a fairly difficult reading level. AI chatbots produce information that is generally accurate and of moderate to high quality in response to the top urological malignancy-related search queries, but the responses lack clear, actionable instructions and exceed the reading level recommended for consumer health information.

PATIENT SUMMARY: Artificial intelligence chatbots produce information that is generally accurate and of moderately high quality in response to popular Google searches about urological cancers. However, their responses are fairly difficult to read, are moderately hard to understand, and lack clear instructions for users to act on.

Copyright © 2023 European Association of Urology. Published by Elsevier B.V. All rights reserved.