MAGICAL: A multi-class classifier to predict synthetic lethal and viable interactions using protein-protein interaction network

PLoS Computational Biology

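Impact factor: 3.6

Journal ranking tier: Biology, tier 2 / Biochemical Research Methods, tier 2; Mathematical and Computational Biology, tier 2

Publication date: Aug 2024

Authors: Anubha Dey, Suresh Mudunuri, Manjari Kiran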

DOI: 10.1371/journal.pcbi.1012336

Abstract

Synthetic lethality (SL) and synthetic viability (SV) are commonly studied genetic interactions in targeted cancer therapy. In SL, inhibiting either gene alone does not affect cancer cell survival, but inhibiting both leads to a lethal phenotype. In SV, inhibiting the vulnerable gene makes the cancer cell sick, while inhibiting the partner gene rescues the cell and promotes viability. Many low- and high-throughput experimental approaches have been employed to identify SLs and SVs, but they are time-consuming and expensive. Computational tools for SL prediction rely on statistical and machine-learning approaches. Almost all of the machine-learning tools are binary classifiers that identify only SL pairs. Most importantly, few properties are known that describe SL and SV well and discriminate between them. We developed MAGICAL (Multi-class Approach for Genetic Interaction in Cancer via Algorithm Learning), a multi-class, random-forest-based machine-learning model for genetic interaction prediction. Network properties of proteins, derived from physical protein-protein interactions, are used as features to classify SL and SV. The model achieves an accuracy of ~80% on the training dataset (CGIdb, BioGRID, and SynLethDB) and performs well on DepMap and other experimentally derived, previously reported datasets. Among all the network properties, the shortest path, average neighbor2, average betweenness, average triangle, and adhesion have significant discriminatory power. MAGICAL is the first multi-class model to identify discriminatory features of synthetic lethal and viable interactions. MAGICAL can predict SL and SV interactions with better accuracy and precision than any existing binary classifier.
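
To make the idea of pairwise network features concrete, the following is a minimal sketch, not the authors' implementation. It assumes an undirected protein-protein interaction network given as an edge list and computes three illustrative features for a gene pair: the shortest path length between the two proteins, their average number of neighbors, and their average triangle count. The function names, the toy edges, and the exact feature definitions are assumptions made for illustration; the abstract does not specify how MAGICAL defines or computes its features, and betweenness, adhesion, and the classifier itself are left out here.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (u, v) edge pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def shortest_path_length(adj, source, target):
    """Breadth-first search; returns the hop count, or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in adj[node]:
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def triangle_count(adj, node):
    """Number of triangles that include `node`."""
    nbrs = adj[node]
    # Each triangle through `node` is found once from each of its two
    # other corners, so the raw sum is halved.
    return sum(len(adj[n] & nbrs) for n in nbrs) // 2


def pair_features(adj, gene_a, gene_b):
    """Illustrative feature dictionary for one gene pair (hypothetical names)."""
    return {
        "shortest_path": shortest_path_length(adj, gene_a, gene_b),
        "avg_neighbors": (len(adj[gene_a]) + len(adj[gene_b])) / 2,
        "avg_triangles": (triangle_count(adj, gene_a)
                          + triangle_count(adj, gene_b)) / 2,
    }


# Toy network with made-up node labels, not real interaction data.
TOY_EDGES = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("D", "E")]
toy_adj = build_graph(TOY_EDGES)
features = pair_features(toy_adj, "A", "E")
```

Feature dictionaries like this one, computed for each labeled gene pair, could then be turned into rows of a feature matrix and passed to a multi-class classifier such as a random forest.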