Research Updates
Articles below are published ahead of final publication in an issue. Please cite articles in the following format: authors, (year), title, journal, DOI.

Proteome-wide copy-number estimation from transcriptomics.

Published: 2024 Sep 27
Authors: Andrew J Sweatt, Cameron D Griffiths, Sarah M Groves, B Bishal Paudel, Lixin Wang, David F Kashatus, Kevin A Janes
Source: Molecular Systems Biology

Abstract:

Protein copy numbers constrain systems-level properties of regulatory networks, but proportional proteomic data remain scarce compared to RNA-seq. We related mRNA to protein statistically using best-available data from quantitative proteomics and transcriptomics for 4366 genes in 369 cell lines. The approach starts with a protein's median copy number and hierarchically appends mRNA-protein and mRNA-mRNA dependencies to define an optimal gene-specific model linking mRNAs to protein. For dozens of cell lines and primary samples, these protein inferences from mRNA outmatch stringent null models, a count-based protein-abundance repository, empirical mRNA-to-protein ratios, and a proteogenomic DREAM challenge winner. The optimal mRNA-to-protein relationships capture biological processes along with hundreds of known protein-protein complexes, suggesting mechanistic relationships. We use the method to identify a viral-receptor abundance threshold for coxsackievirus B3 susceptibility from 1489 systems-biology infection models parameterized by protein inference. When applied to 796 RNA-seq profiles of breast cancer, inferred copy-number estimates collectively re-classify 26-29% of luminal tumors. By adopting a gene-centered perspective of mRNA-protein covariation across different biological contexts, we achieve accuracies comparable to the technical reproducibility of contemporary proteomics. © 2024. The Author(s).
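
The hierarchical idea in the abstract (start from a median copy number, then append mRNA-protein and mRNA-mRNA dependencies only when they help) can be illustrated with a small sketch. This is not the authors' implementation: the log-transformed inputs, ordinary least squares fits, BIC-based selection, and all function names below are assumptions chosen for illustration only.

```python
import numpy as np

def bic(residuals, n_params):
    """Bayesian information criterion for a Gaussian-error model."""
    n = residuals.size
    rss = float(np.sum(residuals ** 2))
    # Small floor avoids log(0) when a model fits perfectly.
    return n * np.log(max(rss, 1e-12) / n) + n_params * np.log(n)

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns coefficients and residuals."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef, y - A @ coef

def select_gene_model(log_protein, log_mrna_self, log_mrna_other):
    """Pick the lowest-BIC model among three nested candidates for one gene.

    All names and the BIC criterion are hypothetical, for illustration only.
    """
    candidates = {}
    # Tier 0: constant model, the median log copy number across samples.
    med = np.median(log_protein)
    candidates["median"] = bic(log_protein - med, 1)
    # Tier 1: append a dependence on the gene's own mRNA.
    _, res1 = fit_linear(log_mrna_self[:, None], log_protein)
    candidates["mRNA-protein"] = bic(res1, 2)
    # Tier 2: append a dependence on a second gene's mRNA.
    X2 = np.column_stack([log_mrna_self, log_mrna_other])
    _, res2 = fit_linear(X2, log_protein)
    candidates["mRNA-mRNA"] = bic(res2, 3)
    best = min(candidates, key=candidates.get)
    return best, candidates
```

Under these assumptions, a gene whose protein level does not vary with any mRNA keeps the median model, while extra terms are retained only when they reduce the residual error enough to offset the complexity penalty.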