A machine learning tool to improve prediction of mediastinal lymph node metastases in non-small cell lung cancer using routinely obtainable [18F]FDG-PET/CT parameters.
Publication date: 2023 Feb 23
Authors:
Julian M M Rogasch, Liza Michaels, Georg L Baumgärtner, Nikolaj Frost, Jens-Carsten Rückert, Jens Neudecker, Sebastian Ochsenreither, Manuela Gerhold, Bernd Schmidt, Paul Schneider, Holger Amthauer, Christian Furth, Tobias Penzkofer
Source:
Eur J Nucl Med Mol I
Abstract:
In patients with non-small cell lung cancer (NSCLC), the accuracy of [18F]FDG-PET/CT for pretherapeutic lymph node (LN) staging is limited by false positive findings. Our aim was to evaluate whether machine learning with routinely obtainable variables improves accuracy over standard visual image assessment. This was a monocentric retrospective analysis of pretherapeutic [18F]FDG-PET/CT in 491 consecutive patients with NSCLC, examined with either an analog PET/CT scanner (training + test cohort, n = 385) or a digital scanner (validation cohort, n = 106). Forty clinical variables, tumor characteristics, and image variables (e.g., primary tumor and LN SUVmax and size) were collected. Different combinations of machine learning methods for feature selection and for classification of N0/1 vs. N2/3 disease were compared. Ten-fold nested cross-validation was used to derive the mean area under the ROC curve of the ten test folds ("test AUC"), and the AUC was also determined in the validation cohort. The reference standard was the final N stage from interdisciplinary consensus (histological results for N2/3 LNs in 96%). N2/3 disease was present in 190 patients (39%; training + test, 37%; validation, 46%; p = 0.09). A gradient boosting classifier (GBM) with 10 features was selected as the final model based on a test AUC of 0.91 (95% confidence interval, 0.87-0.94). The validation AUC was 0.94 (0.89-0.98). At a target sensitivity of approx. 90%, the test/validation accuracy of the GBM was 0.78/0.87. This was significantly higher than the accuracy based on "mediastinal LN uptake > mediastinum" (0.70/0.75; each p < 0.05) or on combined PET/CT criteria (PET positive and/or LN short axis diameter > 10 mm; 0.68/0.75; each p < 0.001). Harmonization of PET images between the two scanners affected SUVmax and visual assessment of the LNs but did not diminish the AUC of the GBM. A machine learning model based on routinely available variables from [18F]FDG-PET/CT improved accuracy in mediastinal LN staging compared to established visual assessment criteria. A web application implementing this model was made available.
© 2023. The Author(s).
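
The abstract reports an AUC and an accuracy "at a target sensitivity of approx. 90%". The following is a minimal sketch of how such figures can be derived from a model's predicted probabilities. It is not the authors' code: the function names, the cutoff rule (highest cutoff that still reaches the target sensitivity), and the toy labels and scores are all assumptions made for illustration only.

```python
import math


def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def threshold_for_sensitivity(labels, scores, target=0.90):
    """Highest cutoff (score >= cutoff means predicted N2/3) that reaches the target sensitivity."""
    pos = sorted((s for y, s in zip(labels, scores) if y == 1), reverse=True)
    # Number of true positives that must be captured; the small epsilon guards
    # against floating-point noise in target * len(pos).
    needed = max(1, math.ceil(target * len(pos) - 1e-9))
    return pos[needed - 1]


def accuracy_at(labels, scores, cutoff):
    """Fraction of patients classified correctly when score >= cutoff is called positive."""
    correct = sum((s >= cutoff) == (y == 1) for y, s in zip(labels, scores))
    return correct / len(labels)


# Hypothetical toy data: 1 = N2/3 disease, 0 = N0/1 disease.
labels = [1] * 10 + [0] * 10
scores = [0.95, 0.90, 0.85, 0.80, 0.75, 0.70, 0.65, 0.60, 0.40, 0.10,
          0.55, 0.45, 0.35, 0.30, 0.25, 0.20, 0.15, 0.12, 0.08, 0.05]

cutoff = threshold_for_sensitivity(labels, scores, target=0.90)
print("AUC:", roc_auc(labels, scores))
print("cutoff:", cutoff)
print("accuracy at cutoff:", accuracy_at(labels, scores, cutoff))
```

Fixing the sensitivity first and then comparing accuracy makes classifiers and visual criteria comparable at the same rate of missed N2/3 disease. In practice the cutoff would presumably be chosen on training data and then applied unchanged to held-out data, which this toy example does not model.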