SG-Fusion: A Swin-Transformer and graph convolution-based multi-modal deep neural network for glioma prognosis
Impact factor: 6.2
Quartiles: Medicine Q2 (Top) / Computer Science, Artificial Intelligence Q2; Engineering, Biomedical Q2; Medical Informatics Q2
Publication date: November 2024
Authors:
Minghan Fu, Ming Fang, Rayyan Azam Khan, Bo Liao, Zhanli Hu, Fang-Xiang Wu
Abstract
The integration of morphological attributes extracted from histopathological images with genomic data holds significant importance for advancing tumor diagnosis, prognosis, and grading. Histopathological images are acquired through microscopic examination of tissue slices and provide valuable insights into cellular structures and pathological features. Genomic data, on the other hand, provide information about tumor gene expression and function. Fusing these two distinct data types is crucial for gaining a more comprehensive understanding of tumor characteristics and progression. In the past, many studies relied on single-modal approaches for tumor diagnosis; however, such approaches cannot fully harness the information available across multiple data sources. To address this limitation, researchers have turned to multi-modal methods that leverage histopathological images and genomic data concurrently. These methods better capture the multifaceted nature of tumors and enhance diagnostic accuracy. Nonetheless, existing multi-modal methods have, to some extent, oversimplified both the per-modality feature-extraction processes and the fusion process. In this study, we present a dual-branch neural network, SG-Fusion. Specifically, for the histopathological modality, we utilize the Swin-Transformer architecture to capture both local and global features and incorporate contrastive learning to encourage the model to discern commonalities and differences in the representation space. For the genomic modality, we develop a graph convolutional network based on gene functional and expression-level similarities. Additionally, our model integrates a cross-attention module to enhance inter-modal information interaction and employs divergence-based regularization to improve generalization. Validation on glioma datasets from The Cancer Genome Atlas (TCGA) demonstrates that SG-Fusion outperforms both single-modal methods and existing multi-modal approaches in survival analysis and tumor grading.
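The abstract names the architectural components but not their implementation. Purely as an illustration, the following PyTorch sketch wires up the dual-branch idea: a Swin-Transformer branch for histopathology images, a simple GCN branch over a gene-similarity graph, and cross-attention fusion feeding grading and survival-risk heads. All names and sizes (SGFusionSketch, EMBED_DIM, NUM_GENES, the hand-rolled GCN layer, shared cross-attention weights) are assumptions, not the authors' code.

```python
# Minimal PyTorch sketch of a dual-branch SG-Fusion-style network.
# Layer sizes, the hand-rolled GCN, mean-pooling of gene nodes, shared
# cross-attention weights, and the two task heads are all illustrative
# assumptions: the abstract does not specify the implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import swin_t  # Swin-Transformer backbone (torchvision >= 0.13)

EMBED_DIM = 256   # shared fusion dimension (assumed)
NUM_GENES = 500   # gene nodes in the similarity graph (assumed)
NUM_GRADES = 3    # e.g. WHO grades II/III/IV (assumed)

class GCNLayer(nn.Module):
    """One graph-convolution step: H' = ReLU(A_norm @ H @ W)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, h, a_norm):
        # h: (B, NUM_GENES, in_dim); a_norm: normalized (NUM_GENES, NUM_GENES) adjacency
        return F.relu(a_norm @ self.lin(h))

class SGFusionSketch(nn.Module):
    def __init__(self):
        super().__init__()
        # Histopathology branch: Swin-Transformer features (768-d for swin_t).
        self.swin = swin_t(weights=None)
        self.swin.head = nn.Identity()          # strip the ImageNet classifier
        self.img_proj = nn.Linear(768, EMBED_DIM)
        # Genomic branch: two GCN layers over a gene-similarity graph,
        # with one expression value per gene as the input node feature.
        self.gcn1 = GCNLayer(1, 64)
        self.gcn2 = GCNLayer(64, EMBED_DIM)
        # Cross-attention for inter-modal information exchange
        # (weights shared across both directions, for brevity).
        self.cross_attn = nn.MultiheadAttention(EMBED_DIM, num_heads=4, batch_first=True)
        self.grade_head = nn.Linear(2 * EMBED_DIM, NUM_GRADES)  # tumor grading
        self.risk_head = nn.Linear(2 * EMBED_DIM, 1)            # survival risk score

    def forward(self, image, expr, a_norm):
        # image: (B, 3, 224, 224); expr: (B, NUM_GENES); a_norm: (NUM_GENES, NUM_GENES)
        z_img = self.img_proj(self.swin(image))                  # (B, EMBED_DIM)
        h = self.gcn1(expr.unsqueeze(-1), a_norm)
        z_gen = self.gcn2(h, a_norm).mean(dim=1)                 # pool gene nodes
        # Each modality attends to the other (single-token sequences).
        q, k = z_img.unsqueeze(1), z_gen.unsqueeze(1)
        z_img2, _ = self.cross_attn(q, k, k)
        z_gen2, _ = self.cross_attn(k, q, q)
        fused = torch.cat([z_img2.squeeze(1), z_gen2.squeeze(1)], dim=-1)
        # Per-branch embeddings are returned so the contrastive and
        # divergence-based losses described in the abstract could be
        # applied to them during training.
        return self.grade_head(fused), self.risk_head(fused), z_img, z_gen

# Smoke test with random tensors; an identity matrix stands in for the
# normalized gene-similarity adjacency.
model = SGFusionSketch()
grades, risk, _, _ = model(torch.randn(2, 3, 224, 224),
                           torch.randn(2, NUM_GENES),
                           torch.eye(NUM_GENES))
print(grades.shape, risk.shape)  # torch.Size([2, 3]) torch.Size([2, 1])
```

A full implementation would presumably train the risk head with a Cox partial-likelihood loss for survival analysis and add the contrastive and divergence-based regularization terms to the objective; those pieces are omitted here to keep the sketch short.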