Research Updates
Articles below are published ahead of final publication in an issue. Please cite articles in the following format: authors, (year), title, journal, DOI.

Evaluating the accuracy of lung-RADS score extraction from radiology reports: Manual entry versus natural language processing.

Published: 2024 Jul 31
Authors: Amir Gandomi, Eusha Hasan, Jesse Chusid, Subroto Paul, Matthew Inra, Alex Makhnevich, Suhail Raoof, Gerard Silvestri, Brett C Bade, Stuart L Cohen
Source: INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS

Abstract:

Radiology scoring systems are critical to the success of lung cancer screening (LCS) programs, impacting patient care, adherence to follow-up, data management and reporting, and program evaluation. The Lung CT Screening Reporting and Data System (Lung-RADS) is a structured radiology scoring system that provides recommendations for LCS follow-up. These recommendations are used (a) in clinical care and (b) by LCS programs monitoring rates of adherence to follow-up. Accurate reporting and reliable collection of Lung-RADS scores are therefore fundamental components of LCS program evaluation and improvement. Unfortunately, because radiology reports vary, extracting Lung-RADS scores is non-trivial, and best practices do not exist. The purpose of this project is to compare methods for extracting Lung-RADS scores from free-text radiology reports.

We retrospectively analyzed reports of LCS low-dose computed tomography (LDCT) examinations performed at a multihospital integrated healthcare network in New York State between January 2016 and July 2023. We compared three methods of Lung-RADS score extraction: manual physician entry at the time of report creation, manual LCS specialist entry after report creation, and an internally developed, rule-based natural language processing (NLP) algorithm. Accuracy, recall, precision, and completeness (i.e., the proportion of LCS exams to which a Lung-RADS score has been assigned) were compared among the three methods.

The dataset includes 24,060 LCS examinations on 14,243 unique patients. The mean patient age was 65 years, and most patients were male (54%) and white (75%). The completeness rate was 65%, 68%, and 99% for radiologists' manual entry, LCS specialists' entry, and the NLP algorithm, respectively. Accuracy, recall, and precision were high across all extraction methods (>94%), though the NLP-based approach scored consistently higher than both manual entry methods on all metrics.

An NLP-based method of LCS score determination is an efficient and more accurate means of extracting Lung-RADS scores than manual review and data entry. NLP-based methods should be considered best practice for extracting structured Lung-RADS scores from free-text radiology reports.

Copyright © 2024 Elsevier B.V. All rights reserved.
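
For readers unfamiliar with rule-based extraction, the following is a minimal illustrative sketch of the general idea, together with the completeness metric as defined in the abstract. It is not the authors' algorithm, which the abstract does not describe in detail. The regular expression, the function names `extract_lung_rads` and `completeness`, and the sample report strings are all assumptions made for illustration; real reports vary far more than this pattern anticipates.

```python
import re
from typing import Optional, Sequence

# Hypothetical pattern (an assumption, not the published rule set).
# It matches phrases such as "Lung-RADS 4A", "LungRADS category: 2",
# or "Lung RADS score 3", and captures one of the categories
# 0, 1, 2, 3, 4A, 4B, 4X. The trailing lookahead rejects version
# strings such as "Lung-RADS 1.1" and unknown suffixes such as "4C".
_LUNG_RADS = re.compile(
    r"lung[\s-]?rads(?:\s+(?:category|score|assessment))?\s*[:=]?\s*"
    r"(4[abx]|[0-4])(?!\w|\.\d)",
    re.IGNORECASE,
)


def extract_lung_rads(report_text: str) -> Optional[str]:
    """Return the first Lung-RADS category found in the text, or None."""
    match = _LUNG_RADS.search(report_text)
    return match.group(1).upper() if match else None


def completeness(scores: Sequence[Optional[str]]) -> float:
    """Proportion of exams to which a score has been assigned."""
    if not scores:
        return 0.0
    return sum(score is not None for score in scores) / len(scores)


if __name__ == "__main__":
    # Made-up report snippets, for demonstration only.
    reports = [
        "IMPRESSION: Lung-RADS 4A, suspicious nodule.",
        "LungRADS category: 2.",
        "No suspicious pulmonary nodules.",
    ]
    scores = [extract_lung_rads(text) for text in reports]
    print(scores, completeness(scores))
```

A sketch like this returns only the first match per report; a production rule set would likely also need to handle conflicting mentions, modifiers, and addenda, none of which are covered here.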