Research Updates

A Novel Detection Method for High-Order SNP Epistatic Interactions Based on Explicit-Encoding-Based Multitasking Harmony Search.

Published: 2024 Jul 02
Authors: Shouheng Tuo, Jiewei Jiang
Source: ARTHRITIS RESEARCH & THERAPY

Abstract:

To elucidate the genetic basis of complex diseases, it is crucial to discover the single-nucleotide polymorphisms (SNPs) that contribute to disease susceptibility. This is particularly challenging for high-order SNP epistatic interactions (HEIs), which exhibit small individual effects but potentially large joint effects. These interactions are difficult to detect because the search space is vast, encompassing billions of possible combinations, and because evaluating each combination is computationally expensive. This study proposes a novel explicit-encoding-based multitasking harmony search algorithm (MTHS-EE-DHEI) designed to address this challenge. The algorithm operates in three stages. First, a harmony search algorithm uses four lightweight evaluation functions, such as a Bayesian network score and entropy, to efficiently explore potential SNP combinations related to disease status. Second, a G-test is applied to filter out statistically insignificant SNP combinations. Finally, two machine-learning-based methods, multifactor dimensionality reduction (MDR) and random forest (RF), are employed to validate the classification performance of the remaining significant SNP combinations. This research aims to demonstrate the effectiveness of MTHS-EE-DHEI in identifying HEIs compared with existing methods, potentially providing valuable insights into the genetic architecture of complex diseases. The performance of MTHS-EE-DHEI was evaluated on twenty simulated disease datasets and three real-world datasets covering age-related macular degeneration (AMD), rheumatoid arthritis (RA), and breast cancer (BC). The results indicate that MTHS-EE-DHEI outperforms four state-of-the-art algorithms in both detection power and computational efficiency. The source code is available at https://github.com/shouhengtuo/MTHS-EE-DHEI.git.

© 2024. International Association of Scientists in the Interdisciplinary Areas.
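The second stage described in the abstract, G-test filtering of candidate SNP combinations, can be illustrated with a minimal sketch. This is not the authors' implementation (see their repository for that). The function name `g_statistic`, the 0/1/2 genotype coding, and the toy data are all assumptions made for illustration. The sketch builds a contingency table of joint genotype against case/control status and computes G = 2 Σ O ln(O / E), where O is an observed cell count and E is the count expected under independence.

```python
import math
from collections import Counter


def g_statistic(genotypes, labels):
    """Return (G, degrees_of_freedom) for joint genotype vs. disease status.

    genotypes: list of tuples, one per sample; each entry is one SNP's
               genotype (assumed here to be coded 0/1/2).
    labels:    list of 0 (control) or 1 (case), same length as genotypes.
    """
    if len(genotypes) != len(labels):
        raise ValueError("genotypes and labels must have the same length")
    n = len(labels)
    cell_counts = Counter(zip(genotypes, labels))  # observed counts O
    row_totals = Counter(genotypes)                # per joint genotype
    col_totals = Counter(labels)                   # per disease status

    g = 0.0
    # Empty cells never appear in the Counter; they contribute 0 to G.
    for (geno, lab), observed in cell_counts.items():
        expected = row_totals[geno] * col_totals[lab] / n
        g += observed * math.log(observed / expected)
    g *= 2.0

    # Simplification: only joint genotypes that actually occur are counted.
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return g, dof


# Toy data (hypothetical): two SNPs with an XOR-like joint effect, so
# neither SNP is informative alone but the pair determines the label.
genos = [(a, b) for a in (0, 1) for b in (0, 1) for _ in range(10)]
status = [a ^ b for a, b in genos]
g, dof = g_statistic(genos, status)
print(f"G = {g:.2f}, dof = {dof}")
```

On this toy data every observed cell is twice its expected count, so G = 80 ln 2 ≈ 55.45 with 3 degrees of freedom. In a real filter, G would be converted to a p-value using the chi-square distribution with `dof` degrees of freedom (for example with `scipy.stats.chi2.sf(g, dof)`), and combinations above a chosen significance threshold would be discarded. The abstract does not state the threshold the authors use.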