An interpretable survival model for diffuse large B-cell lymphoma patients using a biologically informed visible neural network.
Published: 2024 Dec
Authors:
Jie Tan, Jiancong Xie, Jiarong Huang, Weizhen Deng, Hua Chai, Yuedong Yang
Source:
Computational and Structural Biotechnology Journal
Abstract:
Diffuse large B-cell lymphoma (DLBCL) is the most common subtype of non-Hodgkin lymphoma (NHL) and is highly heterogeneous. Assessing its prognosis and genetic subtypes has significant clinical implications. However, existing DLBCL prognostic models are mainly based on transcriptomic profiles, whereas genetic variation detection is more commonly used in clinical practice. In addition, current clustering-based subtyping methods mostly focus on genes with high mutation frequencies and therefore do not adequately explain the heterogeneity of DLBCL. Here, we propose VNNSurv (https://bio-web1.nscc-gz.cn/app/VNNSurv), a survival model for DLBCL patients based on a biologically informed visible neural network (VNN). VNNSurv achieved an average C-index of 0.72 in cross-validation on the HMRN cohort (n = 928), outperforming the baseline methods. The interpretability of VNNSurv facilitated the identification of the most impactful genes and the underlying pathways through which they act on patient outcomes. When only the 30 highest-impact genes were used as genetic input, the overall performance of VNNSurv improved, and a C-index of 0.70 was achieved on the external TCGA cohort (n = 48). Leveraging these high-impact genes, including 16 genes with low (<5%) alteration frequencies, we devised a genetic-based prognostic index (GPI) for risk stratification and a subtype identification method. We stratified the patient group according to the International Prognostic Index (IPI) into three risk grades with significant prognostic differences. Furthermore, the defined subtypes exhibited greater prognostic consistency than those from clustering-based methods. Broadly, VNNSurv is a valuable DLBCL survival model. Its high interpretability has significant value for precision medicine, and its framework can be extended to other diseases. © 2024 The Authors.
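The abstract reports performance as a C-index (0.72 in cross-validation, 0.70 on the external cohort). For readers unfamiliar with the metric, the following is a minimal sketch of Harrell's concordance index for right-censored survival data. It is an illustration under stated assumptions, not the authors' implementation: the function name `concordance_index`, the toy data, and the choice to skip pairs with tied follow-up times are all assumptions made here for brevity.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times:  observed follow-up times
    events: 1 if the event was observed, 0 if the subject was censored
    risks:  predicted risk scores (higher means worse expected outcome)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the subject
        # with the shorter time actually had the event (assumption: pairs
        # with tied times are skipped to keep the sketch short).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # higher risk failed earlier: concordant
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data, invented for illustration only (not from any cohort).
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.4, 0.6, 0.1]
print(concordance_index(toy_times, toy_events, toy_risks))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk, which is the scale on which the reported 0.72 and 0.70 should be read.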