Predicting BRAFV600E mutations in papillary thyroid carcinoma using six machine learning algorithms based on ultrasound elastography.
Publication date: 2023 Aug 03
Authors:
Enock Adjei Agyekum, Yu-Guo Wang, Fei-Ju Xu, Debora Akortia, Yong-Zhen Ren, Kevoyne Hakeem Chambers, Xian Wang, Jenny Olalia Taupa, Xiao-Qin Qian
Source:
Cell Death & Disease
Abstract:
The most common BRAF mutation is a thymine (T) to adenine (A) missense mutation at nucleotide 1796 (T1796A, V600E). The BRAFV600E gene encodes a protein-dependent kinase (PDK), a key component of the mitogen-activated protein kinase pathway that is essential for controlling cell proliferation, differentiation, and death. The BRAFV600E mutation causes PDK to be activated improperly and continuously, resulting in abnormal proliferation and differentiation in papillary thyroid carcinoma (PTC). Using radiomic features from ultrasound (US) elastography, this study sought to develop and validate six machine learning algorithms for predicting BRAFV600E mutation in PTC patients before surgery. The study used routine US strain elastography images from 138 PTC patients, divided into two groups: those without the BRAFV600E mutation (n = 75) and those with it (n = 63). Patients were randomly assigned to a training set (70%) or a validation set (30%). A total of 479 radiomic features were extracted from the strain elastography US images. Pearson's correlation coefficient (PCC) and recursive feature elimination (RFE) with stratified tenfold cross-validation were used to reduce the number of features. Using the selected radiomic features, six machine learning algorithms were compared for predicting BRAFV600E status: support vector machine with a linear kernel (SVM_L), support vector machine with a radial basis function kernel (SVM_RBF), logistic regression (LR), Naïve Bayes (NB), K-nearest neighbors (KNN), and linear discriminant analysis (LDA). Performance was evaluated with accuracy (ACC), area under the curve (AUC), sensitivity (SEN), specificity (SPEC), positive predictive value (PPV), negative predictive value (NPV), decision curve analysis (DCA), and calibration curves. ① The diagnostic performance of the machine learning algorithms depended on 27 radiomic features. ② The AUCs for NB, KNN, LDA, LR, SVM_L, and SVM_RBF were 0.80 (95% confidence interval [CI]: 0.65-0.91), 0.87 (95% CI 0.73-0.95), 0.91 (95% CI 0.79-0.98), 0.92 (95% CI 0.80-0.98), 0.93 (95% CI 0.80-0.98), and 0.98 (95% CI 0.88-1.00), respectively. ③ Echogenicity, the ratio of vertical to horizontal diameter, and elasticity differed significantly between PTC patients with and without BRAFV600E. Machine learning algorithms based on US elastography radiomic features can predict the likelihood of BRAFV600E in PTC patients, which can help physicians identify patients at risk of carrying the mutation. Among the six algorithms, SVM_RBF achieved the best ACC (0.93), AUC (0.98), SEN (0.95), SPEC (0.90), PPV (0.91), and NPV (0.95). © 2023. Springer Nature Limited.
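Two steps named in the abstract, the Pearson-correlation feature filter and the confusion-matrix metrics (ACC, SEN, SPEC, PPV, NPV), can be illustrated with a minimal Python sketch. This is not the authors' code: the greedy filtering strategy, the 0.9 correlation threshold, the function names, the toy feature values, and the confusion-matrix counts are all assumptions chosen for illustration. The abstract reports none of them.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant feature: treat as uncorrelated
    return cov / (sx * sy)


def drop_correlated(features, threshold=0.9):
    """Greedy redundancy filter (assumed strategy, hypothetical threshold).

    features maps a feature name to its list of values across patients.
    A feature is kept only if its |r| with every already-kept feature
    is below the threshold.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


def diagnostic_metrics(tp, fn, fp, tn):
    """Standard diagnostic metrics from confusion-matrix counts.

    Assumes every denominator is non-zero.
    """
    return {
        "ACC": (tp + tn) / (tp + fn + fp + tn),
        "SEN": tp / (tp + fn),   # true positive rate
        "SPEC": tn / (tn + fp),  # true negative rate
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
    }


# Toy data, not taken from the study.
toy_features = {
    "f1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "f2": [2.1, 4.0, 6.2, 8.1, 9.9],  # nearly a multiple of f1
    "f3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
kept = drop_correlated(toy_features)

# Hypothetical counts, not the study's confusion matrix.
metrics = diagnostic_metrics(tp=40, fn=10, fp=5, tn=45)
```

In this toy run, `f2` is discarded because it is almost perfectly correlated with `f1`, while `f3` is retained. A full pipeline would follow the filter with RFE under stratified cross-validation and then fit the six classifiers, which is usually done with a library rather than by hand.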