Automated Interstitial Lung Abnormality Probability Prediction at CT: A Stepwise Machine Learning Approach in the Boston Lung Cancer Study.
Publication date: 2024 Sep
Authors:
Akinori Hata, Kota Aoyagi, Takuya Hino, Masami Kawagishi, Noriaki Wada, Jiyeon Song, Xinan Wang, Vladimir I Valtchinov, Mizuki Nishino, Yohei Muraguchi, Minoru Nakatsugawa, Akihiro Koga, Naoki Sugihara, Masahiro Ozaki, Gary M Hunninghake, Noriyuki Tomiyama, Yi Li, David C Christiani, Hiroto Hatabu
Source:
RADIOLOGY
Abstract:
Background: It is increasingly recognized that interstitial lung abnormalities (ILAs) detected at CT have potential clinical implications, but automated identification of ILAs has not yet been fully established.
Purpose: To develop and test automated ILA probability prediction models using machine learning techniques on CT images.
Materials and Methods: This secondary analysis of a retrospective study included CT scans from patients in the Boston Lung Cancer Study collected between February 2004 and June 2017. Visual assessment of ILAs by two radiologists and a pulmonologist served as the ground truth. Automated ILA probability prediction models were developed that used a stepwise approach involving section inference and case inference models. The section inference model produced an ILA probability for each CT section, and the case inference model integrated these probabilities to generate the case-level ILA probability. For indeterminate sections and cases, both two- and three-label methods were evaluated. For the case inference model, three machine learning classifiers were tested (support vector machine [SVM], random forest [RF], and convolutional neural network [CNN]). Receiver operating characteristic analysis was performed to calculate the area under the receiver operating characteristic curve (AUC).
Results: A total of 1382 CT scans (mean patient age, 67 years ± 11 [SD]; 759 women) were included. Of the 1382 CT scans, 104 (8%) were assessed as having ILA, 492 (36%) as indeterminate for ILA, and 786 (57%) as without ILA according to ground-truth labeling. The cohort was divided into a training set (n = 96; ILA, n = 48), a validation set (n = 24; ILA, n = 12), and a test set (n = 1262; ILA, n = 44). Among the models evaluated (two- and three-label section inference models; two- and three-label SVM, RF, and CNN case inference models), the model using the three-label method in the section inference model and the two-label method and RF in the case inference model achieved the highest AUC, at 0.87.
Conclusion: The model demonstrated substantial performance in estimating ILA probability, indicating its potential utility in clinical settings.
© RSNA, 2024. Supplemental material is available for this article. See also the editorial by Zagurovskaya in this issue.
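The stepwise design described in the abstract (per-section probabilities first, then a case-level probability, then ROC analysis) could be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the section probabilities are invented toy values, the case-level step uses a simple top-k mean as a stand-in for the SVM, RF, and CNN classifiers the study tested, and the names `case_features`, `case_probability`, and `roc_auc` are hypothetical.

```python
from statistics import mean


def case_features(section_probs, top_k=3):
    """Summarize per-section ILA probabilities into a few case-level features.

    Assumption: a scan has a variable number of sections, so some fixed-length
    summary is needed before a case-level model can consume the probabilities.
    """
    ranked = sorted(section_probs, reverse=True)
    return {
        "max": ranked[0],
        "mean": mean(ranked),
        "top_k_mean": mean(ranked[:top_k]),
    }


def case_probability(section_probs, top_k=3):
    """Stand-in case-level score: mean of the top_k highest section probabilities.

    The study used trained classifiers here; this rule is only a placeholder.
    """
    return case_features(section_probs, top_k)["top_k_mean"]


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case outscores a random
    negative case, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy cases: (per-section probabilities, ground-truth label; 1 = ILA, 0 = no ILA).
cases = [
    ([0.05, 0.10, 0.08, 0.02], 0),
    ([0.20, 0.85, 0.90, 0.70], 1),
    ([0.10, 0.30, 0.25, 0.15], 0),
    ([0.40, 0.60, 0.75, 0.55], 1),
]
scores = [case_probability(probs) for probs, _ in cases]
labels = [label for _, label in cases]
auc = roc_auc(labels, scores)
print(f"case-level scores: {[round(s, 3) for s in scores]}, AUC = {auc}")
```

In a setup closer to the one reported, the dictionary returned by `case_features` would be replaced by whatever inputs the trained case inference classifier expects, and the AUC would be computed on a held-out test set rather than on the data used for fitting.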