Research Updates
Articles below are published ahead of final publication in an issue. Please cite articles in the following format: authors, (year), title, journal, DOI.

Multimodal AI Combining Clinical and Imaging Inputs Improves Prostate Cancer Detection.

Publication date: 2024 Jul 29
Authors: Christian Roest, Derya Yakar, Dorjan Ivan Rener Sitar, Joeran S Bosma, Dennis B Rouw, Stefan Johannes Fransen, Henkjan Huisman, Thomas C Kwee
Source: INVESTIGATIVE RADIOLOGY

Abstract:

Deep learning (DL) studies for the detection of clinically significant prostate cancer (csPCa) on magnetic resonance imaging (MRI) often overlook potentially relevant clinical parameters such as prostate-specific antigen, prostate volume, and age. This study explored the integration of clinical parameters and MRI-based DL to enhance diagnostic accuracy for csPCa on MRI.

We retrospectively analyzed 932 biparametric prostate MRI examinations performed for suspected csPCa (ISUP ≥2) at 2 institutions. Each MRI scan was automatically analyzed by a previously developed DL model to detect and segment csPCa lesions. Three sets of features were extracted: DL lesion suspicion levels, clinical parameters (prostate-specific antigen, prostate volume, age), and MRI-based lesion volumes for all DL-detected lesions. Six multimodal artificial intelligence (AI) classifiers were trained for each combination of feature sets, employing both early (feature-level) and late (decision-level) information fusion methods. The diagnostic performance of each model was tested internally on 20% of center 1 data and externally on center 2 data (n = 529). Receiver operating characteristic comparisons determined the optimal feature combination and information fusion method and assessed the benefit of multimodal versus unimodal analysis. The optimal model performance was compared with a radiologist using PI-RADS.

Internally, the multimodal AI integrating DL suspicion levels with clinical features via early fusion achieved the highest performance. Externally, it surpassed baselines using clinical parameters (0.77 vs 0.67 area under the curve [AUC], P < 0.001) and DL suspicion levels alone (AUC: 0.77 vs 0.70, P = 0.006). Early fusion outperformed late fusion in external data (0.77 vs 0.73 AUC, P = 0.005). No significant performance gaps were observed between multimodal AI and radiologist assessments (internal: 0.87 vs 0.88 AUC; external: 0.77 vs 0.75 AUC, both P > 0.05).

Multimodal AI (combining DL suspicion levels and clinical parameters) outperforms clinical and MRI-only AI for csPCa detection. Early information fusion enhanced AI robustness in our multicenter setting. Incorporating lesion volumes did not enhance diagnostic efficacy.

Copyright © 2024 The Author(s). Published by Wolters Kluwer Health, Inc.
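To make the early (feature-level) versus late (decision-level) fusion distinction concrete, the following is a minimal sketch, not the authors' implementation: it assumes scikit-learn logistic-regression classifiers and synthetic stand-ins for the per-exam DL suspicion level and the clinical parameters (PSA, prostate volume, age), since the paper does not disclose its exact classifier type or preprocessing.

```python
# Illustrative sketch only: classifier choice, feature layout, and data below are
# assumptions; they are not taken from the paper.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 200
dl_suspicion = rng.random(n)          # hypothetical per-exam DL lesion suspicion level
clinical = rng.random((n, 3))         # hypothetical scaled PSA, prostate volume, age
y = rng.integers(0, 2, n)             # hypothetical csPCa label (ISUP >= 2)

# Early (feature-level) fusion: concatenate modalities, train one classifier.
X_early = np.column_stack([dl_suspicion, clinical])
early = LogisticRegression().fit(X_early, y)
auc_early = roc_auc_score(y, early.predict_proba(X_early)[:, 1])

# Late (decision-level) fusion: train one classifier per modality,
# then average their predicted probabilities.
m_dl = LogisticRegression().fit(dl_suspicion.reshape(-1, 1), y)
m_cl = LogisticRegression().fit(clinical, y)
p_late = 0.5 * (m_dl.predict_proba(dl_suspicion.reshape(-1, 1))[:, 1]
                + m_cl.predict_proba(clinical)[:, 1])
auc_late = roc_auc_score(y, p_late)

# For brevity this evaluates on the training data; the study instead tested
# on held-out internal (center 1) and external (center 2) sets.
print(f"early-fusion AUC: {auc_early:.2f}, late-fusion AUC: {auc_late:.2f}")
```

In the study, early fusion of DL suspicion levels with clinical parameters was the configuration that generalized best across centers; the sketch only illustrates where the two fusion strategies diverge in the pipeline.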