drGAT: Attention-Guided Gene Assessment of Drug Response Utilizing a Drug-Cell-Gene Heterogeneous Network.
Published: 2024 May 14
Authors:
Yoshitaka Inoue, Hunmin Lee, Tianfan Fu, Augustin Luna
Source:
BIOMEDICINE & PHARMACOTHERAPY
Abstract:
Drug development is a lengthy process with a high failure rate. Machine learning is increasingly used to facilitate it. These models aim to enhance our understanding of drug characteristics, including their activity in biological contexts. However, a major challenge in drug response (DR) prediction is model interpretability, which aids in the validation of findings. This is important in biomedicine, where model outputs need to be understandable in comparison with established knowledge of drug interactions with proteins. drGAT, a graph deep learning model, leverages a heterogeneous graph composed of relationships between proteins, cell lines, and drugs. drGAT is designed with two objectives: DR prediction, framed as binary sensitivity prediction, and elucidation of drug mechanisms from attention coefficients. drGAT demonstrated superior performance over existing models, achieving 78% accuracy (and precision) and a 76% F1 score for 269 DNA-damaging compounds of the NCI60 drug response dataset. To assess the model's interpretability, we reviewed drug-gene co-occurrences in PubMed abstracts and compared them with the 5 genes with the highest attention coefficients for each drug. We also examined whether known relationships were retained in the model by inspecting the neighborhoods of topoisomerase-related drugs. For example, our model retained TOP1 as a highly weighted predictive feature for irinotecan and topotecan, in addition to other genes that could potentially be regulators of the drugs. Our method can be used to accurately predict sensitivity to drugs and may be useful in the identification of biomarkers relating to the treatment of cancer patients.
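The interpretability step described above (ranking genes by attention coefficient for each drug and keeping the top 5) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `top_k_genes`, the `(drug, gene, coefficient)` triple format, and the toy values are all assumptions made for illustration, and the drug and gene labels are placeholders rather than results from the paper.

```python
from collections import defaultdict


def top_k_genes(attention, k=5):
    """Return the k genes with the highest attention coefficient per drug.

    `attention` is an iterable of (drug, gene, coefficient) triples.
    This input format is a hypothetical convention, not the paper's.
    """
    per_drug = defaultdict(list)
    for drug, gene, coef in attention:
        per_drug[drug].append((gene, coef))

    ranked = {}
    for drug, pairs in per_drug.items():
        # Sort by coefficient, largest first, and keep the first k gene names.
        pairs.sort(key=lambda pair: pair[1], reverse=True)
        ranked[drug] = [gene for gene, _ in pairs[:k]]
    return ranked


# Toy input with made-up placeholder names and values (not data from the paper).
toy_attention = [
    ("drug_x", "gene_a", 0.10),
    ("drug_x", "gene_b", 0.40),
    ("drug_x", "gene_c", 0.25),
    ("drug_y", "gene_a", 0.30),
    ("drug_y", "gene_c", 0.05),
]

if __name__ == "__main__":
    print(top_k_genes(toy_attention, k=2))
```

The resulting per-drug gene lists could then be compared against an independent source, such as the literature co-occurrence review mentioned in the abstract, to judge whether highly weighted genes are plausible.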