Prediction of the 3D cancer genome from whole-genome sequencing using InfoHiC.
Published: 2024 Sep 25
Authors:
Yeonghun Lee, Sung-Hye Park, Hyunju Lee
Source:
Molecular Systems Biology
Abstract:
3D genome prediction in cancer is crucial for uncovering the impact of structural variations (SVs) on tumorigenesis, especially when they occur in noncoding regions. We present InfoHiC, a systemic framework for predicting the 3D cancer genome directly from whole-genome sequencing (WGS). InfoHiC applies contig-specific copy number encoding to the SV contig assembly and performs a contig-to-total Hi-C conversion to predict cancer Hi-C from multiple SV contigs. Using breast cancer cell line data, we showed that InfoHiC can predict 3D genome folding from all types of SVs. We applied it to WGS data from patients with breast cancer and pediatric patients with medulloblastoma, and identified neo topologically associating domains. For breast cancer, we discovered super-enhancer hijacking events associated with oncogenic overexpression and poor survival outcomes. For medulloblastoma, we found SVs in noncoding regions that caused super-enhancer hijacking events of medulloblastoma driver genes (GFI1, GFI1B, and PRDM6). In addition, we provide trained models for cancer Hi-C prediction from WGS at https://github.com/dmcb-gist/InfoHiC, which can help uncover the impacts of SVs in cancer patients and reveal novel therapeutic targets. © 2024. The Author(s).