Research Updates

Contextual AI models for single-cell protein biology.

Published: Aug 2024
Authors: Michelle M Li, Yepeng Huang, Marissa Sumathipala, Man Qing Liang, Alberto Valdeolivas, Ashwin N Ananthakrishnan, Katherine Liao, Daniel Marbach, Marinka Zitnik
Source: Nature Methods

Abstract:

Understanding protein function and developing molecular therapies require deciphering the cell types in which proteins act as well as the interactions between proteins. However, modeling protein interactions across biological contexts remains challenging for existing algorithms. Here we introduce PINNACLE, a geometric deep learning approach that generates context-aware protein representations. Leveraging a multiorgan single-cell atlas, PINNACLE learns on contextualized protein interaction networks to produce 394,760 protein representations from 156 cell type contexts across 24 tissues. PINNACLE's embedding space reflects cellular and tissue organization, enabling zero-shot retrieval of the tissue hierarchy. Pretrained protein representations can be adapted for downstream tasks: enhancing 3D structure-based representations for resolving immuno-oncological protein interactions, and investigating drugs' effects across cell types. PINNACLE outperforms state-of-the-art models in nominating therapeutic targets for rheumatoid arthritis and inflammatory bowel diseases and pinpoints cell type contexts with higher predictive capability than context-free models. PINNACLE's ability to adjust its outputs on the basis of the context in which it operates paves the way for large-scale context-specific predictions in biology. © 2024. The Author(s).