SG-Fusion: A Swin-Transformer and graph convolution-based multi-modal deep neural network for glioma prognosis
Impact factor: 6.2
Partition: Medicine Q2 (Top) / Computer Science (Artificial Intelligence) Q2, Engineering (Biomedical) Q2, Medicine (Informatics) Q2
Published: November 2024
Authors:
Minghan Fu, Ming Fang, Rayyan Azam Khan, Bo Liao, Zhanli Hu, Fang-Xiang Wu
DOI:
10.1016/j.artmed.2024.102972
Keywords:
Contrastive learning; Genomic data; Histopathological images; Multimodal learning; Tumor diagnosis
Abstract
The integration of morphological attributes extracted from histopathological images with genomic data holds significant importance for advancing tumor diagnosis, prognosis, and grading. Histopathological images, acquired through microscopic examination of tissue slices, provide valuable insights into cellular structures and pathological features, while genomic data describe tumor gene expression and function. Fusing these two distinct data types is crucial for a more comprehensive understanding of tumor characteristics and progression. Many earlier studies relied on single-modal approaches to tumor diagnosis, which cannot fully harness information from multiple data sources. To address this limitation, researchers have turned to multi-modal methods that leverage histopathological images and genomic data jointly; these methods better capture the multifaceted nature of tumors and improve diagnostic accuracy. Nonetheless, existing multi-modal methods tend to oversimplify both the per-modality feature extraction and the fusion process. In this study, we present a dual-branch neural network, SG-Fusion. Specifically, for the histopathological modality, we use the Swin-Transformer architecture to capture both local and global features and incorporate contrastive learning to encourage the model to discern commonalities and differences in the representation space. For the genomic modality, we construct a graph convolutional network based on gene functional and expression-level similarities. In addition, the model integrates a cross-attention module to enhance information interaction between the modalities and employs divergence-based regularization to improve generalization. Validation on glioma data from The Cancer Genome Atlas (TCGA) demonstrates that SG-Fusion outperforms both single-modal methods and existing multi-modal approaches in survival analysis and tumor grading.
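To make the dual-branch design concrete, below is a minimal PyTorch sketch in the spirit of SG-Fusion: a Swin-Transformer encoder for histopathology images, a small graph convolutional network over a precomputed gene-similarity graph, and a cross-attention step through which the image branch queries the genomic node embeddings. All dimensions, the pooling scheme, and the head layout are illustrative assumptions, not the authors' released implementation.

```python
# Illustrative sketch only; module names, sizes, and the fusion layout
# are assumptions, not the paper's released code.
import torch
import torch.nn as nn
from torchvision.models import swin_t


class GCNLayer(nn.Module):
    """One graph-convolution step: H' = ReLU(A_hat @ H @ W)."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, h: torch.Tensor, a_hat: torch.Tensor) -> torch.Tensor:
        # a_hat is the symmetrically normalized adjacency D^-1/2 (A+I) D^-1/2,
        # built offline from gene functional / expression similarity.
        return torch.relu(a_hat @ self.linear(h))


class SGFusionSketch(nn.Module):
    def __init__(self, gene_feat_dim: int = 1, embed_dim: int = 256, n_grades: int = 3):
        super().__init__()
        # Histopathology branch: Swin-Transformer backbone (tiny variant here),
        # with the classification head replaced so it yields feature vectors.
        self.swin = swin_t(weights=None)
        swin_out = self.swin.head.in_features  # 768 for swin_t
        self.swin.head = nn.Identity()
        self.img_proj = nn.Linear(swin_out, embed_dim)

        # Genomic branch: two GCN layers over the gene-similarity graph.
        self.gcn1 = GCNLayer(gene_feat_dim, embed_dim)
        self.gcn2 = GCNLayer(embed_dim, embed_dim)

        # Cross-attention: image features attend over gene-node embeddings.
        self.cross_attn = nn.MultiheadAttention(embed_dim, num_heads=4, batch_first=True)

        # Task heads: tumor grade logits and a scalar survival risk score.
        self.grade_head = nn.Linear(2 * embed_dim, n_grades)
        self.risk_head = nn.Linear(2 * embed_dim, 1)

    def forward(self, images, gene_x, a_hat):
        # images: (B, 3, 224, 224); gene_x: (B, n_genes, gene_feat_dim);
        # a_hat: (n_genes, n_genes) normalized similarity adjacency.
        img = self.img_proj(self.swin(images))           # (B, embed_dim)
        g = self.gcn2(self.gcn1(gene_x, a_hat), a_hat)   # (B, n_genes, embed_dim)
        gen = g.mean(dim=1)                              # pooled gene embedding

        q = img.unsqueeze(1)                             # (B, 1, embed_dim)
        fused, _ = self.cross_attn(q, g, g)              # image queries gene nodes

        z = torch.cat([fused.squeeze(1), gen], dim=-1)   # joint representation
        return self.grade_head(z), self.risk_head(z).squeeze(-1)
```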
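The abstract also names two auxiliary objectives: contrastive learning on the representation space and a divergence-based regularizer. Since their exact formulations are not given here, the sketch below uses two standard stand-ins, an NT-Xent (InfoNCE) contrastive loss between paired views and a symmetric KL penalty between the two branches' predictive distributions; both should be read as plausible substitutes rather than the paper's definitions.

```python
# Stand-in auxiliary losses; the paper's exact formulations may differ.
import torch
import torch.nn.functional as F


def nt_xent(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE-style contrastive loss: matched pairs (z1[i], z2[i]) are
    pulled together; all other pairings in the batch are pushed apart."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature  # (B, B) scaled cosine similarities
    targets = torch.arange(z1.size(0), device=z1.device)
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))


def symmetric_kl(p_logits: torch.Tensor, q_logits: torch.Tensor) -> torch.Tensor:
    """Divergence regularizer: penalizes disagreement between the two
    branches' predictive distributions over the same targets."""
    p, q = F.log_softmax(p_logits, dim=-1), F.log_softmax(q_logits, dim=-1)
    return 0.5 * (F.kl_div(p, q.exp(), reduction="batchmean") +
                  F.kl_div(q, p.exp(), reduction="batchmean"))
```

In training, such terms would typically be added to the main task losses (e.g., cross-entropy for grading and a Cox partial-likelihood loss for survival analysis) with weighting coefficients; the specific weights and loss forms used by the authors are not reproduced here.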