Research Updates
Articles below are published ahead of final publication in an issue. Please cite articles in the following format: authors, (year), title, journal, DOI.

Enhancing metastatic colorectal cancer prediction through advanced feature selection and machine learning techniques.

Published: 2024 Sep 02
Authors: Hui Yang, Jun Liu, Na Yang, Qingsheng Fu, Yingying Wang, Mingquan Ye, Shaoneng Tao, Xiaocen Liu, Qingqing Li
Source: INTERNATIONAL IMMUNOPHARMACOLOGY

Abstract:

Colorectal cancer (CRC) is the third most prevalent cancer globally, posing a significant challenge due to its high rate of metastasis. Approximately 20% of patients with CRC present with distant metastases at diagnosis, and over 50% develop metastases within five years. Accurate prediction of metastasis is crucial for improving survival outcomes in patients with CRC.

This study introduces an innovative cost-sensitive fast correlation-based filter (CS-FCBF) algorithm for feature selection, integrated with machine learning techniques to predict metastatic CRC. The CS-FCBF algorithm reduced the number of genomic features from 184 to 9 critical genes: CXCL9, C2CD4B, RGCC, GFI1, BEX2, CXCL3, FOXQ1, PBK, and PLAG1. The methodology combined in vitro and in vivo experiments with analysis of publicly available single-cell RNA-seq datasets to validate the findings.

Applying the CS-FCBF algorithm led to a significant improvement in prediction model performance, with an average increase of 21.16% in the area under the precision-recall curve. The nine identified genes hold potential as diagnostic biomarkers and therapeutic targets for metastatic CRC.

This study highlights the critical role of advanced feature selection methods, combined with machine learning, in addressing the challenge of class imbalance in medical diagnosis, particularly for CRC. Early detection of metastasis is vital, and the findings underscore the importance of the identified genes in the metastatic process of CRC. The methodology applied here offers valuable insights and paves the way for future research in other cancers or diseases that face similar diagnostic challenges.

Copyright © 2024 Elsevier B.V. All rights reserved.
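For readers unfamiliar with the filter that CS-FCBF builds on, the following is a minimal sketch of the standard fast correlation-based filter (FCBF), which ranks features by symmetrical uncertainty with the class label and then discards features that are redundant with a higher-ranked one. It is not the authors' cost-sensitive variant, whose details the abstract does not give. The feature names (`gene_a` to `gene_d`), the binarized toy values, and the `delta` threshold are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value between 0 and 1."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def fcbf(features, labels, delta=0.0):
    """Plain FCBF (not cost-sensitive). Returns selected names, most relevant first."""
    # Step 1: relevance of each feature to the class label.
    relevance = {
        name: symmetrical_uncertainty(column, labels)
        for name, column in features.items()
    }
    # Step 2: keep features above the threshold, ranked by relevance.
    ranked = sorted(
        (name for name in relevance if relevance[name] > delta),
        key=lambda name: relevance[name],
        reverse=True,
    )
    # Step 3: drop any feature that correlates with an already selected
    # feature at least as strongly as it correlates with the label.
    selected = []
    while ranked:
        best = ranked.pop(0)
        selected.append(best)
        ranked = [
            name for name in ranked
            if symmetrical_uncertainty(features[best], features[name]) < relevance[name]
        ]
    return selected


# Hypothetical, imbalanced toy data: 6 non-metastatic (0) and 2 metastatic (1) samples.
toy_labels = [0, 0, 0, 0, 0, 0, 1, 1]
toy_features = {
    "gene_a": [0, 0, 0, 0, 0, 1, 1, 1],  # strongly related to the label
    "gene_b": [0, 0, 0, 0, 1, 1, 1, 1],  # noisier near-copy of gene_a (redundant)
    "gene_c": [0, 1, 0, 1, 0, 1, 0, 1],  # independent of the label (irrelevant)
    "gene_d": [0, 0, 0, 0, 0, 0, 0, 1],  # relevant and not redundant with gene_a
}

print(fcbf(toy_features, toy_labels, delta=0.05))  # ['gene_a', 'gene_d']
```

In this toy run the irrelevant feature is removed by the threshold and the redundant one by the pairwise check. A cost-sensitive variant would presumably weight errors on the minority (metastatic) class more heavily, but how CS-FCBF does so is not described in the abstract.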