An Integrated Method Based on Wasserstein Distance and Graph for Cancer Subtype Discovery.
Publication date: 2023 Aug 01
Authors:
Qingqing Cao, Jianping Zhao, Haiyun Wang, Qi Guan, Chunhou Zheng
Source:
IEEE/ACM Transactions on Computational Biology and Bioinformatics
Abstract:
Because cancer pathogenesis is complex at different omics levels, a comprehensive method is needed to accurately distinguish and discover cancer subtypes for cancer treatment. In this paper, we propose a new multi-omics cancer subtype identification method, WVGMO, which is based on a variational autoencoder measured by Wasserstein distance and a graph autoencoder. The method relies on two main models. The first is a variational autoencoder measured by Wasserstein distance (WVAE), which is used to extract the latent space information of each omics data type. The second is a graph autoencoder (GAE) with second-order proximity, which can retain both the topological structure information and the feature information of the multi-omics data. Cancer subtypes are then identified by k-means clustering. Extensive experiments were conducted on seven different cancers using four types of omics data from TCGA. The results show that WVGMO performs as well as or better than most state-of-the-art integrative methods.
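The abstract does not state the exact WVAE objective. As a minimal sketch only, the Python below shows one common closed form that a Wasserstein-based regularizer could take: the 2-Wasserstein distance between two Gaussians with diagonal covariance, applied to an encoder posterior and a standard normal prior. The function names, the diagonal-Gaussian assumption, and the choice of a standard normal prior are illustrative assumptions, not details taken from the paper.

```python
import math


def w2_squared_diag_gaussians(mu1, sigma1, mu2, sigma2):
    """Squared 2-Wasserstein distance between two diagonal Gaussians.

    For N(mu1, diag(sigma1^2)) and N(mu2, diag(sigma2^2)) the closed form is
    ||mu1 - mu2||^2 + ||sigma1 - sigma2||^2, where sigma are standard deviations.
    """
    if not (len(mu1) == len(sigma1) == len(mu2) == len(sigma2)):
        raise ValueError("all inputs must have the same length")
    mean_term = sum((a - b) ** 2 for a, b in zip(mu1, mu2))
    std_term = sum((a - b) ** 2 for a, b in zip(sigma1, sigma2))
    return mean_term + std_term


def w2_to_standard_normal(mu, sigma):
    """2-Wasserstein distance from N(mu, diag(sigma^2)) to N(0, I).

    Assumption: this plays the role that the KL term plays in a standard VAE;
    the paper's actual loss may differ.
    """
    d = len(mu)
    return math.sqrt(w2_squared_diag_gaussians(mu, sigma, [0.0] * d, [1.0] * d))


if __name__ == "__main__":
    # Hypothetical encoder output for one sample with a 2-dimensional latent space.
    print(w2_to_standard_normal([3.0, 4.0], [1.0, 1.0]))
```

In a pipeline of the kind the abstract describes, a term like this would be computed per sample from the encoder's mean and standard deviation and added to the reconstruction loss. The resulting latent vectors would then feed the graph autoencoder and k-means steps.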