Research Updates
Articles below are published ahead of final publication in an issue. Please cite articles in the following format: authors, (year), title, journal, DOI.

The Gene Expression Landscape of Disease Genes.

Published: 2024 Jun 21
Authors: Judit García-González, Saul Garcia-Gonzalez, Lathan Liou, Paul F O'Reilly
Source: Alzheimers & Dementia

Abstract:

Fine-mapping and gene-prioritisation techniques applied to the latest Genome-Wide Association Study (GWAS) results have prioritised hundreds of genes as causally associated with disease. Here we leverage these recently compiled lists of high-confidence causal genes to interrogate where in the body disease genes operate. Specifically, we combine GWAS summary statistics, gene prioritisation results and gene expression RNA-seq data from 46 tissues and 204 cell types in relation to 16 major diseases (including 8 cancers). In tissues and cell types with well-established relevance to the disease, the prioritised genes typically have higher absolute and relative (i.e. tissue/cell-specific) expression than non-prioritised 'control' genes. Examples include brain tissues in psychiatric disorders (P-value < 1×10⁻⁷), microglia in Alzheimer's disease (P-value = 9.8×10⁻³) and colon mucosa in colorectal cancer (P-value < 1×10⁻³). We also observe significantly higher expression of disease genes in multiple tissues and cell types with no established links to the corresponding disease. While some of these results may be explained by cell types that span multiple tissues, such as macrophages in brain, blood, lung and spleen in relation to Alzheimer's disease (P-values < 1×10⁻³), the cause of others is unclear and motivates further investigation that may provide novel insights into disease etiology. Examples include mammary tissue in type 2 diabetes (P-value < 1×10⁻⁷); reproductive tissues such as breast, uterus, vagina and prostate in coronary artery disease (P-value < 1×10⁻⁴); and motor neurons in psychiatric disorders (P-value < 3×10⁻⁴). In the GTEx dataset, tissue type is the major predictor of gene expression, but the contribution of each predictor (tissue, sample, subject, batch) varies widely among disease-associated genes. Finally, we highlight genes with the highest levels of gene expression in relevant tissues to guide functional follow-up studies. Our results could offer novel insights into the tissues and cells involved in disease initiation, inform drug target and delivery strategies, highlight potential off-target effects, and exemplify the relative performance of different statistical tests for linking disease genes with tissue and cell type gene expression.
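
The comparison described in the abstract, prioritised genes against non-prioritised 'control' genes within one tissue, can be illustrated with a small sketch. The abstract does not say which statistical tests the authors used, so the one-sided permutation test below is an assumption chosen for simplicity, not the paper's method. The gene names, expression values, and the helper names `relative_expression` and `permutation_p_value` are all hypothetical. Relative expression is taken here to be the share of a gene's total expression found in one tissue, which is one possible definition of tissue specificity among several.

```python
import random
from statistics import mean


def relative_expression(expr_by_tissue, tissue):
    """Share of a gene's summed expression that falls in `tissue` (0 to 1)."""
    total = sum(expr_by_tissue.values())
    return expr_by_tissue[tissue] / total if total > 0 else 0.0


def permutation_p_value(prioritised, control, n_perm=5000, seed=0):
    """One-sided permutation P-value for mean(prioritised) > mean(control).

    Assumption: a label-shuffling test stands in for whatever tests the
    study actually applied.
    """
    rng = random.Random(seed)
    observed = mean(prioritised) - mean(control)
    pooled = list(prioritised) + list(control)
    k = len(prioritised)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        # Count shuffles whose group difference is at least as large.
        if mean(pooled[:k]) - mean(pooled[k:]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Toy expression values (hypothetical genes, arbitrary units).
expression = {
    "GENE_A": {"brain": 40.0, "liver": 5.0, "colon": 5.0},
    "GENE_B": {"brain": 30.0, "liver": 10.0, "colon": 10.0},
    "GENE_C": {"brain": 35.0, "liver": 5.0, "colon": 10.0},
    "GENE_D": {"brain": 10.0, "liver": 20.0, "colon": 20.0},
    "GENE_E": {"brain": 5.0, "liver": 30.0, "colon": 15.0},
    "GENE_F": {"brain": 12.0, "liver": 12.0, "colon": 26.0},
    "GENE_G": {"brain": 8.0, "liver": 22.0, "colon": 20.0},
    "GENE_H": {"brain": 15.0, "liver": 15.0, "colon": 20.0},
}
prioritised_genes = {"GENE_A", "GENE_B", "GENE_C"}  # hypothetical gene list

tissue = "brain"
prioritised = [relative_expression(e, tissue)
               for g, e in expression.items() if g in prioritised_genes]
control = [relative_expression(e, tissue)
           for g, e in expression.items() if g not in prioritised_genes]

p_value = permutation_p_value(prioritised, control)
print(f"{tissue}: one-sided permutation P-value = {p_value:.4f}")
```

Running the same comparison on absolute expression only requires replacing `relative_expression(e, tissue)` with `e[tissue]`. A real analysis would repeat the test for every tissue or cell type and would need to account for multiple testing.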