Five Critical Gene-Based Biomarkers With Optimal Performance for Hepatocellular Carcinoma.
Publication date: 2023
Authors:
Yongjun Liu, Heping Zhang, Yuqing Xu, Yao-Zhong Liu, David P Al-Adra, Matthew M Yeh, Zhengjun Zhang
Source:
TROPICAL MEDICINE & INTERNATIONAL HEALTH
Abstract:
Hepatocellular carcinoma (HCC) is one of the most fatal cancers in the world. There is an urgent need to understand the molecular background of HCC in order to facilitate the identification of biomarkers and the discovery of effective therapeutic targets. Published transcriptomic studies have reported a large number of genes that are individually significant for HCC. However, reliable biomarkers remain to be determined. In this study, building on max-linear competing risk factor models, we developed a machine learning analytical framework for transcriptomic data that identifies the smallest set of differentially expressed genes (DEGs). By analyzing 9 public whole-transcriptome datasets (containing 1184 HCC samples and 672 nontumor controls), we identified 5 critical DEGs (i.e., CCDC107, CXCL12, GIGYF1, GMNN, and IFFO1) between HCC and control samples. Classifiers built on these 5 DEGs reached nearly perfect performance in identifying HCC. The performance of the 5 DEGs was further validated in a US Caucasian cohort that we collected (containing 17 HCC samples with paired nontumor tissue). The conceptual advance of our work lies in modeling gene-gene interactions and correcting for batch effects within the analytical framework. The classifiers built on the 5 DEGs demonstrated clear signature patterns for HCC. The results are interpretable, robust, and reproducible across diverse cohorts and populations with various disease etiologies, indicating that the 5 DEGs are intrinsic variables that can describe the overall features of HCC at the genomic level. The analytical framework applied in this study may pave a new way for improving transcriptome profiling analysis of human cancers. © The Author(s) 2023.
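As an illustration of the general idea of a max-linear competing-factor classifier, the following is a minimal sketch and not the authors' implementation. It assumes that each "competing factor" is a linear combination of the 5 gene expression values, that the largest component drives the risk score, and that the score is mapped to a probability with a logistic function. The function names, the number of components, and all coefficients are hypothetical placeholders; they are not fitted values from the study and carry no biological meaning.

```python
import numpy as np

# Gene order used for the feature vector (names taken from the abstract).
GENES = ["CCDC107", "CXCL12", "GIGYF1", "GMNN", "IFFO1"]


def max_linear_score(x, intercepts, coefs):
    """Return, for each sample, the maximum over the competing linear components.

    x          : array of shape (n_samples, n_genes) or (n_genes,)
    intercepts : array of shape (n_components,)
    coefs      : array of shape (n_components, n_genes)
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))
    components = x @ coefs.T + intercepts  # shape (n_samples, n_components)
    return components.max(axis=1)


def predict_proba(x, intercepts, coefs):
    """Map the max-linear score to a probability with a logistic function."""
    score = max_linear_score(x, intercepts, coefs)
    return 1.0 / (1.0 + np.exp(-score))


def predict(x, intercepts, coefs, threshold=0.5):
    """Label a sample 1 (tumor-like) when its probability reaches the threshold."""
    return (predict_proba(x, intercepts, coefs) >= threshold).astype(int)


# Hypothetical placeholder parameters: two competing components over 5 genes.
intercepts = np.array([-1.0, -0.5])
coefs = np.array([
    [1.0, -1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.5, 1.0, -0.5],
])

samples = np.array([
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [3.0, 0.0, 0.0, 0.0, 0.0],
])
labels = predict(samples, intercepts, coefs)
```

Taking the maximum across components means a sample is flagged as soon as any single gene combination signals risk, which is one way such a model can capture heterogeneous expression patterns with a small gene set.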