Research Updates

Integrative analysis of cancer multimodality data identifying COPS5 as a novel biomarker of diffuse large B-cell lymphoma.

Published: 2024
Authors: Yutong Dai, Jingmei Li, Keita Yamamoto, Susumu Goyama, Martin Loza, Sung-Joon Park, Kenta Nakai
Source: Frontiers in Genetics

Abstract:

Preventing, diagnosing, and treating diseases requires accurate clinical biomarkers, which remains challenging. Recently, advanced computational approaches have accelerated the discovery of promising biomarkers from high-dimensional multimodal data. Although machine-learning methods have greatly contributed to the research fields, handling data sparseness, which is not unusual in research settings, is still an issue as it leads to limited interpretability and performance in the presence of missing information. Here, we propose a novel pipeline integrating joint non-negative matrix factorization (JNMF), identifying key features within sparse high-dimensional heterogeneous data, and a biological pathway analysis, interpreting the functionality of features by detecting activated signaling pathways. By applying our pipeline to large-scale public cancer datasets, we identified sets of genomic features relevant to specific cancer types as common pattern modules (CPMs) of JNMF. We further detected COPS5 as a potential upstream regulator of pathways associated with diffuse large B-cell lymphoma (DLBCL). COPS5 exhibited co-overexpression with MYC, TP53, and BCL2, known DLBCL marker genes, and its high expression was correlated with a lower survival probability of DLBCL patients. Using the CRISPR-Cas9 system, we confirmed the tumor growth effect of COPS5, which suggests it as a novel prognostic biomarker for DLBCL. Our results highlight that integrating multiple high-dimensional data and effectively decomposing them to interpretable dimensions unravels hidden biological importance, which enhances the discovery of clinical biomarkers.

Copyright © 2024 Dai, Li, Yamamoto, Goyama, Loza, Park and Nakai.
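
To make the idea of joint non-negative matrix factorization more concrete, here is a minimal sketch in Python. It is not the authors' pipeline: it assumes the simplest textbook formulation, in which several non-negative matrices X_k that share the same samples (rows) are approximated as W H_k with one shared factor W, fitted by standard multiplicative updates on the summed squared error. The function name `jnmf`, its parameters, and the synthetic data are hypothetical, and the sketch does not handle missing values or sparsity, which the abstract names as a central concern.

```python
import numpy as np

def jnmf(matrices, rank, n_iter=200, eps=1e-9, seed=0):
    """Illustrative joint NMF: X_k ~= W @ H_k with a shared W (assumed formulation).

    matrices: list of non-negative arrays, each of shape (n_samples, n_features_k)
    Returns the shared factor W, the list of H_k, and the loss after each iteration.
    """
    rng = np.random.default_rng(seed)
    n_samples = matrices[0].shape[0]
    W = rng.random((n_samples, rank))
    Hs = [rng.random((rank, X.shape[1])) for X in matrices]
    losses = []
    for _ in range(n_iter):
        # Update each modality-specific factor H_k with W held fixed.
        Hs = [H * (W.T @ X) / (W.T @ W @ H + eps) for X, H in zip(matrices, Hs)]
        # Update the shared factor W using all modalities at once.
        numer = sum(X @ H.T for X, H in zip(matrices, Hs))
        denom = W @ sum(H @ H.T for H in Hs) + eps
        W = W * numer / denom
        # Summed squared reconstruction error across modalities.
        losses.append(sum(np.linalg.norm(X - W @ H) ** 2 for X, H in zip(matrices, Hs)))
    return W, Hs, losses

# Synthetic example: two "modalities" generated from the same hidden sample factor.
rng = np.random.default_rng(42)
W_true = rng.random((30, 4))
X_expr = W_true @ rng.random((4, 50))   # stand-in for an expression-like matrix
X_meth = W_true @ rng.random((4, 20))   # stand-in for a second data type
W, Hs, losses = jnmf([X_expr, X_meth], rank=4)
print(f"loss: first iteration {losses[0]:.3f}, last iteration {losses[-1]:.3f}")
```

In this formulation each column of W pairs with the matching row of every H_k, so features with large weights in the same row across modalities form a candidate cross-modality module. How the paper actually defines and selects its common pattern modules is not stated in the abstract.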