Current Strengths and Weaknesses of ChatGPT as a Resource for Radiation Oncology Patients and Providers.
Publication date: 2024 Mar 15
Authors:
Warren Floyd, Troy Kleber, David J Carpenter, Melisa Pasli, Jamiluddin Qazi, Christina Huang, Jim Leng, Bradley G Ackerson, Matthew Pierpoint, Joseph K Salama, Matthew J Boyer
Source:
Int J Radiat Oncol
Abstract:
Chat Generative Pre-Trained Transformer (ChatGPT), an artificial intelligence program that uses natural language processing to generate conversational-style responses to questions or inputs, is increasingly being used by both patients and health care professionals. This study aims to evaluate the accuracy and comprehensiveness of ChatGPT in radiation oncology-related domains, including answering common patient questions, summarizing landmark clinical research studies, and providing literature reviews with specific references supporting current standard-of-care clinical practice in radiation oncology.

We assessed the performance of ChatGPT version 3.5 (ChatGPT3.5) in 3 areas. We evaluated ChatGPT3.5's ability to answer 28 templated patient-centered questions applied across 9 cancer types. We then tested ChatGPT3.5's ability to summarize specific portions of 10 landmark studies in radiation oncology. Next, we used ChatGPT3.5 to identify scientific studies supporting current standard-of-care practice in clinical radiation oncology for 5 different cancer types. Each response was graded independently by 2 reviewers, with discordant grades resolved by a third reviewer.

ChatGPT3.5 frequently generated inaccurate or incomplete responses. Only 39.7% of responses to patient-centered questions were considered correct and comprehensive. When summarizing landmark studies in radiation oncology, 35.0% of ChatGPT3.5's responses were accurate and comprehensive, improving to 43.3% when the full text of the study was provided. ChatGPT3.5's ability to present a list of studies related to standard-of-care clinical practices was also unsatisfactory, with 50.6% of the provided studies fabricated.

ChatGPT should not be considered a reliable radiation oncology resource for patients or providers at this time, as it frequently generates inaccurate or incomplete responses. However, natural language processing-based artificial intelligence programs are rapidly evolving, and future versions of ChatGPT or similar programs may demonstrate improved performance in this domain.

Copyright © 2023 Elsevier Inc. All rights reserved.
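The abstract describes the scoring scheme only in prose (each response graded independently by 2 reviewers, with discordant grades resolved by a third). Purely as an illustration of that adjudication logic, a minimal Python sketch follows; the grade categories, class and function names, and sample data are hypothetical and are not taken from the study.

from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Grade(Enum):
    # Hypothetical grading categories; the abstract only names "correct and comprehensive".
    CORRECT_COMPREHENSIVE = "correct and comprehensive"
    INCOMPLETE = "correct but incomplete"
    INCORRECT = "incorrect"

@dataclass
class ScoredResponse:
    reviewer_1: Grade
    reviewer_2: Grade
    reviewer_3: Optional[Grade] = None  # consulted only when the first two disagree

def adjudicate(r: ScoredResponse) -> Grade:
    """Return the final grade: the shared grade if reviewers agree, else the third reviewer's call."""
    if r.reviewer_1 == r.reviewer_2:
        return r.reviewer_1
    if r.reviewer_3 is None:
        raise ValueError("Discordant grades require a third reviewer")
    return r.reviewer_3

def proportion_correct(responses: list[ScoredResponse]) -> float:
    """Fraction of responses graded correct and comprehensive after adjudication."""
    final = [adjudicate(r) for r in responses]
    return sum(g is Grade.CORRECT_COMPREHENSIVE for g in final) / len(final)

# Example with three scored responses, one of which needs third-reviewer adjudication.
sample = [
    ScoredResponse(Grade.CORRECT_COMPREHENSIVE, Grade.CORRECT_COMPREHENSIVE),
    ScoredResponse(Grade.INCOMPLETE, Grade.CORRECT_COMPREHENSIVE, Grade.INCOMPLETE),
    ScoredResponse(Grade.INCORRECT, Grade.INCORRECT),
]
print(f"{proportion_correct(sample):.1%}")  # -> 33.3%

Applied over each of the three evaluation domains, a tally of this kind would yield summary percentages analogous to those reported in the results (for example, 39.7% for the patient-centered questions).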