GlycoSLASH: Concurrent Glycopeptide Identification from Multiple Related LC-MS/MS Data Sets by Using Spectral Clustering and Library Searching.
Publication date: 2023 Feb 21
Authors:
Sujun Li, Jianhui Zhu, David M Lubman, He Zhou, Haixu Tang
Source:
JOURNAL OF PROTEOME RESEARCH
Abstract:
Liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS) is commonly adopted in large-scale glycoproteomic studies involving hundreds of disease and control samples. The software for glycopeptide identification in such data (e.g., the commercial software Byonic) analyzes each data set individually and does not exploit the redundant glycopeptide spectra present in the related data sets. Herein, we present a novel concurrent approach for glycopeptide identification in multiple related glycoproteomic data sets by using spectral clustering and spectral library searching. The evaluation on two large-scale glycoproteomic data sets showed that the concurrent approach can identify 105%-224% more spectra as glycopeptides compared to glycopeptide identification on individual data sets using Byonic alone. The improvement in glycopeptide identification also enabled the discovery of several potential biomarkers of protein glycosylation in hepatocellular carcinoma patients.
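The general idea of reusing identifications across related runs can be illustrated with a minimal sketch. This is not the GlycoSLASH implementation: the abstract does not specify the clustering algorithm, similarity measure, or thresholds, so the greedy clustering, the cosine score on binned peaks, and every function name and parameter value below (`bin_spectrum`, `cluster_spectra`, `propagate_ids`, `threshold=0.7`, `precursor_tol=0.05`) are assumptions chosen for illustration, and the spectra are toy data.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Turn (m/z, intensity) peaks into a sparse, unit-length binned vector."""
    vec = defaultdict(float)
    for mz, intensity in peaks:
        # Square-root scaling dampens the influence of dominant peaks.
        vec[int(mz / bin_width)] += math.sqrt(intensity)
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {k: v / norm for k, v in vec.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors."""
    if len(a) > len(b):
        a, b = b, a
    return sum(v * b.get(k, 0.0) for k, v in a.items())


def cluster_spectra(spectra, threshold=0.7, precursor_tol=0.05):
    """Greedy clustering (an assumed, simplified scheme): a spectrum joins the
    first cluster whose representative has a matching precursor m/z and a
    cosine similarity of at least `threshold`; otherwise it starts a new one."""
    clusters = []
    for spectrum_id, (precursor_mz, peaks) in spectra.items():
        vec = bin_spectrum(peaks)
        for c in clusters:
            close = abs(c["precursor_mz"] - precursor_mz) <= precursor_tol
            if close and cosine(c["rep"], vec) >= threshold:
                c["members"].append(spectrum_id)
                break
        else:
            clusters.append(
                {"precursor_mz": precursor_mz, "rep": vec, "members": [spectrum_id]}
            )
    return clusters


def propagate_ids(clusters, identified):
    """Transfer an identification to unidentified cluster members, but only
    when all identified members of that cluster agree on one glycopeptide."""
    transferred = {}
    for c in clusters:
        labels = {identified[s] for s in c["members"] if s in identified}
        if len(labels) == 1:
            label = labels.pop()
            for s in c["members"]:
                if s not in identified:
                    transferred[s] = label
    return transferred


# Toy spectra keyed by a hypothetical "run:scan" ID: (precursor m/z, peaks).
spectra = {
    "run1:1": (1000.50, [(204.09, 100.0), (366.14, 80.0), (528.19, 40.0)]),
    "run2:7": (1000.51, [(204.08, 90.0), (366.15, 85.0), (528.20, 35.0)]),
    "run2:9": (1000.50, [(150.0, 100.0), (300.0, 50.0)]),
}
# Pretend a search engine identified only the spectrum from run 1.
identified = {"run1:1": "toy-glycopeptide-1"}

clusters = cluster_spectra(spectra)
print(propagate_ids(clusters, identified))
```

In this toy case the unidentified spectrum `run2:7` clusters with the identified `run1:1` and inherits its label, while `run2:9` shares no peaks with them and stays unassigned. A real pipeline would also need to control the error rate of transferred identifications, which this sketch does not attempt.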