mosGraphGen: a novel tool to generate multi-omic signaling graphs to facilitate integrative and interpretable graph AI model development.
Published: 2024 May 18
Authors:
Heming Zhang, Dekang Cao, Zirui Chen, Xiuyuan Zhang, Yixin Chen, Cole Sessions, Carlos Cruchaga, Philip Payne, Guangfu Li, Michael Province, Fuhai Li
Source:
GENOMICS PROTEOMICS & BIOINFORMATICS
Abstract:
Multi-omic data (genomics, epigenomics, transcriptomics, and proteomics) characterize complex cellular signaling systems at multiple levels and from multiple views, and together provide a holistic view of complex cellular signaling pathways. However, integrating and interpreting multi-omics data remains challenging. Graph neural network (GNN) AI models have been widely used to analyze graph-structured datasets and are well suited to integrative multi-omics analysis for two reasons: they can naturally integrate and represent multi-omics data as a biologically meaningful multi-level signaling graph, and they can interpret multi-omics data through node and edge ranking analysis for signaling flow/cascade inference. However, it is non-trivial for graph-AI model developers to pre-analyze multi-omics data and convert them into graph-structured data for individual samples that can be fed directly into graph-AI models. To resolve this challenge, we developed mosGraphGen (multi-omics signaling graph generator), a novel computational tool that generates multi-omics signaling graphs (mos-graphs) of individual samples by mapping the multi-omics data onto a biologically meaningful multi-level background signaling network. With mosGraphGen, AI model developers can directly apply and evaluate their models using these mos-graphs. We evaluated mosGraphGen using multi-omics datasets from both cancer and Alzheimer's disease (AD) samples. The code of mosGraphGen is open-source and publicly available via GitHub: https://github.com/Multi-OmicGraphBuilder/mosGraphGen.
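The central idea in the abstract, mapping one sample's multi-omics measurements onto a shared multi-level background network, can be illustrated with a minimal sketch. This is not the mosGraphGen API: the function names (`build_background`, `sample_graph`), the three levels (`gene`, `transcript`, `protein`), the placeholder gene names, the toy values, and the choice to fill missing measurements with 0.0 are all assumptions made for illustration.

```python
# Illustrative sketch only; names, levels, and values are hypothetical
# and are not taken from the mosGraphGen code base.
from typing import Dict, List, Tuple

# Assumed omics levels: each gene gets one node per level.
LAYERS = ("gene", "transcript", "protein")


def build_background(
    genes: List[str], signaling_edges: List[Tuple[str, str]]
) -> Tuple[List[str], List[Tuple[int, int]]]:
    """Build a multi-level background network shared by all samples."""
    nodes = [f"{g}:{layer}" for g in genes for layer in LAYERS]
    index = {name: i for i, name in enumerate(nodes)}
    edges: List[Tuple[int, int]] = []
    for g in genes:
        # Within-gene edges: gene -> transcript -> protein.
        edges.append((index[f"{g}:gene"], index[f"{g}:transcript"]))
        edges.append((index[f"{g}:transcript"], index[f"{g}:protein"]))
    for src, dst in signaling_edges:
        # Between-gene signaling edges connect protein-level nodes.
        edges.append((index[f"{src}:protein"], index[f"{dst}:protein"]))
    return nodes, edges


def sample_graph(
    nodes: List[str],
    edges: List[Tuple[int, int]],
    omics: Dict[str, Dict[str, float]],
) -> Dict[str, list]:
    """Attach one sample's measurements to the background nodes.

    `omics` maps level -> {gene: value}; missing values become 0.0
    (an assumption of this sketch).
    """
    features = []
    for name in nodes:
        gene, layer = name.rsplit(":", 1)
        features.append(omics.get(layer, {}).get(gene, 0.0))
    # The topology is shared across samples; only the features differ.
    return {"x": features, "edge_index": edges}


# Toy background network with placeholder gene names.
nodes, edges = build_background(["GENE_A", "GENE_B"], [("GENE_A", "GENE_B")])

# Toy measurements for a single sample.
graph = sample_graph(
    nodes,
    edges,
    {
        "gene": {"GENE_A": 1.0},
        "transcript": {"GENE_A": 2.5, "GENE_B": 0.7},
        "protein": {"GENE_B": 1.2},
    },
)
print(graph["x"])
print(graph["edge_index"])
```

In this sketch every sample reuses the same node order and edge list, so the per-sample graphs differ only in their node features. That is one simple way to make the output directly consumable by a GNN that expects a feature vector plus an edge index.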