Machine learning analysis of lung squamous cell carcinoma gene expression datasets reveals novel prognostic signatures.
Publication date: 2023 Aug 29
Authors:
Hemant Kumar Joon, Anamika Thalor, Dinesh Gupta
Source:
COMPUTERS IN BIOLOGY AND MEDICINE
Abstract:
Lung squamous cell carcinoma (LUSC) patients are often diagnosed at an advanced stage and have poor prognoses. Identifying novel biomarkers for LUSC is therefore of utmost importance.
Multiple datasets from the NCBI-GEO repository were obtained and merged to construct the complete dataset. A subset containing only known cancer driver genes was also constructed from this complete dataset. Machine learning classifiers were then employed to obtain the best features from both datasets. In parallel, differential gene expression analysis was performed, followed by survival and enrichment analyses.
The kNN classifier performed comparatively better on the top 40 gene features of the complete dataset and the top 50 gene features of the driver dataset. Of these 90 gene features, 35 were found to be differentially regulated. Lasso-penalized Cox regression further reduced the number of genes to eight. The median risk score of these eight genes significantly stratified the patients, and low-risk patients had significantly better overall survival. The robust performance of these eight genes was validated on the TCGA dataset. Pathway enrichment analysis showed that these genes are associated with cell cycle, cell proliferation, and migration.
This study demonstrates that an integrated approach involving machine learning and systems biology may effectively identify novel biomarkers for LUSC.
Copyright © 2023 Elsevier Ltd. All rights reserved.
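The median risk score stratification described in the abstract can be illustrated with a minimal sketch. It assumes the risk score is the usual Cox linear predictor (the sum of coefficient times expression over the signature genes); the abstract does not state the exact formula. The gene names, coefficients, and cohort below are placeholders invented for illustration, not the eight genes or values reported in the study, and the survival comparison between the two groups is not shown.

```python
import random
from statistics import median

# Placeholder signature: names and coefficients are invented for illustration.
# In the study, coefficients would come from the Lasso-penalized Cox model.
COEFFICIENTS = {
    f"GENE_{i}": coef
    for i, coef in enumerate(
        [0.42, -0.31, 0.18, 0.27, -0.12, 0.35, -0.22, 0.09], start=1
    )
}


def risk_score(expression, coefficients=COEFFICIENTS):
    """Assumed Cox linear predictor: sum of coefficient * expression per gene."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def stratify_by_median(scores):
    """Split patients into 'high' and 'low' risk at the cohort median score."""
    cutoff = median(scores.values())
    groups = {
        patient: ("high" if score > cutoff else "low")
        for patient, score in scores.items()
    }
    return cutoff, groups


# Synthetic cohort of 10 patients with random expression values (not real data).
rng = random.Random(0)
cohort = {
    f"patient_{p}": {gene: rng.gauss(0.0, 1.0) for gene in COEFFICIENTS}
    for p in range(1, 11)
}

scores = {patient: risk_score(expr) for patient, expr in cohort.items()}
cutoff, groups = stratify_by_median(scores)

for patient in sorted(scores, key=scores.get):
    print(f"{patient}: score={scores[patient]:+.3f} group={groups[patient]}")
```

In a full analysis, the two groups would then be compared with a survival test such as the log-rank test, and the same coefficients and cutoff rule would be applied to an independent cohort for validation.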