MIMOSA: a resource consisting of improved methylome prediction models increases power to identify DNA methylation-phenotype associations.
Publication date: December 2024
Authors:
Hunter J Melton, Zichen Zhang, Hong-Wen Deng, Lang Wu, Chong Wu
Source:
Epigenetics
Abstract:
Although DNA methylation (DNAm) has been implicated in the pathogenesis of numerous complex diseases, from cancer to cardiovascular disease to autoimmune disease, the exact methylation sites that play key roles in these processes remain elusive. One strategy to identify putative causal CpG sites and enhance disease etiology understanding is to conduct methylome-wide association studies (MWASs), in which predicted DNA methylation that is associated with complex diseases can be identified. However, current MWAS models are primarily trained using the data from single studies, thereby limiting the methylation prediction accuracy and the power of subsequent association studies. Here, we introduce a new resource, MWAS Imputing Methylome Obliging Summary-level mQTLs and Associated LD matrices (MIMOSA), a set of models that substantially improve the prediction accuracy of DNA methylation and subsequent MWAS power through the use of a large summary-level mQTL dataset provided by the Genetics of DNA Methylation Consortium (GoDMC). Through the analyses of GWAS (genome-wide association study) summary statistics for 28 complex traits and diseases, we demonstrate that MIMOSA considerably increases the accuracy of DNA methylation prediction in whole blood, crafts fruitful prediction models for low heritability CpG sites, and determines markedly more CpG site-phenotype associations than preceding methods. Finally, we use MIMOSA to conduct a case study on high cholesterol, pinpointing 146 putatively causal CpG sites.
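For readers unfamiliar with how an MWAS combines a methylation prediction model with GWAS summary statistics, the following is a minimal sketch of the generic summary-statistic association test used by TWAS-style methods. The abstract does not state which test statistic MIMOSA uses, so the formula below (weighted sum of GWAS z-scores divided by the square root of the LD-weighted variance) is an assumption, and the function names `mwas_z` and `two_sided_p` and all numeric inputs are hypothetical illustrations, not MIMOSA's API or data.

```python
import math


def mwas_z(weights, gwas_z, ld):
    """Summary-statistic association z-score for one CpG site.

    weights: per-SNP weights from a DNAm prediction model (hypothetical values)
    gwas_z:  per-SNP GWAS z-scores for the trait, in the same SNP order
    ld:      SNP-by-SNP LD (correlation) matrix as a list of lists

    Assumed generic TWAS-style statistic: z = w'Z / sqrt(w' LD w).
    """
    n = len(weights)
    if len(gwas_z) != n or len(ld) != n or any(len(row) != n for row in ld):
        raise ValueError("weights, gwas_z and ld must cover the same SNPs")

    # Numerator: GWAS z-scores combined with the prediction weights.
    numerator = sum(w * z for w, z in zip(weights, gwas_z))

    # Denominator: variance of predicted methylation implied by the LD matrix.
    variance = sum(
        weights[i] * ld[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    if variance <= 0:
        raise ValueError("predicted methylation has non-positive variance")

    return numerator / math.sqrt(variance)


def two_sided_p(z):
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical toy inputs for a CpG site predicted from three SNPs.
weights = [0.5, -0.2, 0.1]
gwas_z = [2.0, -1.0, 0.5]
ld = [
    [1.0, 0.3, 0.0],
    [0.3, 1.0, 0.2],
    [0.0, 0.2, 1.0],
]

z = mwas_z(weights, gwas_z, ld)
p = two_sided_p(z)
print(f"z = {z:.3f}, p = {p:.4f}")
```

Under this assumed statistic, more accurate prediction weights capture more of the genetically regulated methylation signal at a CpG site, which is one way to read the abstract's link between prediction accuracy and the power of the subsequent association study.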