MBFusion: Multi-modal balanced fusion and multi-task learning for cancer diagnosis and prognosis
Impact factor: 6.3
Journal ranking (CAS partition): Medicine Zone 2 / Mathematical & Computational Biology Zone 1; Biology Zone 2; Computer Science, Interdisciplinary Applications Zone 2; Engineering, Biomedical Zone 2
Publication date: October 2024
Authors:
Ziye Zhang, Wendong Yin, Shijin Wang, Xiaorou Zheng, Shoubin Dong
Abstract
Pathological images and molecular omics provide important information for predicting diagnosis and prognosis. These two kinds of heterogeneous modal data contain complementary information, and their effective fusion can better reveal the complex mechanisms of cancer. However, because the modalities are learned with different representation methods, their expressive strength varies greatly across tasks, so many multi-modal fusion approaches fail to achieve the best results. In this paper, MBFusion is proposed to handle multiple tasks, such as prediction of diagnosis and prognosis, through multi-modal balanced fusion. The MBFusion framework uses two specially constructed graph convolutional networks to extract features from molecular omics data, and uses ResNet to extract features from pathological image data, retaining important deep features through attention and clustering; this effectively improves both feature representations, making their expressive abilities balanced and comparable. The features of the two modalities are then fused through a cross-attention Transformer, and the fused features are used to learn the two tasks of cancer subtype classification and survival analysis via multi-task learning. MBFusion is compared with other state-of-the-art methods on two public cancer datasets and shows an improvement of up to 10.1% across three evaluation metrics. Ablation experiments explore the contribution of each modality and each framework module to the performance. Furthermore, the interpretability of MBFusion is explained in detail to show its application value.
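To make the fusion stage described above concrete, the sketch below illustrates the general idea of fusing two modality feature sets with cross-attention and feeding the fused representation to two task heads (subtype classification and survival risk). This is a minimal illustration, not the authors' implementation: the module names, token counts, feature dimensions, single-block design, and pooling strategy are all assumptions made for clarity.

```python
# Minimal sketch (not the MBFusion code) of cross-attention fusion with two task heads.
# All dimensions and design details are illustrative assumptions.
import torch
import torch.nn as nn


class CrossAttentionFusion(nn.Module):
    def __init__(self, dim: int = 256, heads: int = 4, n_subtypes: int = 4):
        super().__init__()
        # Pathology tokens attend to omics tokens, and vice versa.
        self.path_to_omics = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.omics_to_path = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        # Two task heads share the fused representation (multi-task learning).
        self.subtype_head = nn.Linear(dim, n_subtypes)   # cancer subtype logits
        self.survival_head = nn.Linear(dim, 1)           # survival risk score

    def forward(self, path_feats: torch.Tensor, omics_feats: torch.Tensor):
        # path_feats: (B, Np, dim) pathology-image tokens (e.g. from ResNet + clustering)
        # omics_feats: (B, No, dim) molecular-omics tokens (e.g. from GCNs)
        p, _ = self.path_to_omics(path_feats, omics_feats, omics_feats)
        o, _ = self.omics_to_path(omics_feats, path_feats, path_feats)
        fused = self.norm(torch.cat([p, o], dim=1).mean(dim=1))  # pooled fused feature
        return self.subtype_head(fused), self.survival_head(fused)


if __name__ == "__main__":
    model = CrossAttentionFusion()
    path = torch.randn(2, 16, 256)    # toy pathology tokens
    omics = torch.randn(2, 8, 256)    # toy omics tokens
    logits, risk = model(path, omics)
    print(logits.shape, risk.shape)   # torch.Size([2, 4]) torch.Size([2, 1])
```

In such a setup the classification head would typically be trained with cross-entropy and the survival head with a Cox-style or discrete-time survival loss, with the two losses combined for multi-task learning; the specific losses and weighting used by MBFusion are described in the paper itself.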