Research Updates
Articles below are published ahead of final publication in an issue. Please cite articles in the following format: authors, (year), title, journal, DOI.

Survival analysis in breast cancer: evaluating ensemble learning techniques for prediction.

Published: 2024
Author: Gonca Buyrukoğlu
Source: Disease Models & Mechanisms

Abstract:

Breast cancer is the most common form of cancer among women worldwide. Although breast cancer research and awareness have gained considerable momentum, there is still no single treatment, owing to the heterogeneity of the disease. Survival data can be of particular interest in breast cancer studies for understanding the disease's dynamic and complex trajectories. This study addresses the most important covariates affecting disease progression. It uses two datasets: the German Breast Cancer Study Group 2 (GBSG2) dataset and the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) dataset. In both datasets, the outcome of interest is relapse of the disease and the time at which the relapse occurs. Three models were employed to analyse the breast cancer datasets: the Cox proportional hazards (PH) model, the random survival forest (RSF) and the conditional inference forest (Cforest). The goal of the study is to apply these methods to the prediction of breast cancer progression and to compare their performance under two different estimation methods: the bootstrap estimate and the bootstrap .632 estimate. Model performance was evaluated using the concordance index (C-index), a measure of discrimination, and prediction error curves (pec). For both datasets, the Cox PH model has a lower C-index and a larger prediction error than the RSF and Cforest approaches. The results for the GBSG2 and METABRIC datasets indicate that the RSF and Cforest algorithms provide non-parametric alternatives to the Cox PH model for estimating the survival probability of breast cancer patients. ©2024 Buyrukoğlu.
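
The two evaluation ideas named in the abstract, the C-index and the bootstrap .632 estimate, can be illustrated with a minimal Python sketch. This is not the author's code. The function names `concordance_index` and `bootstrap_632` are hypothetical, the toy data are invented purely for illustration, and pairs with tied event times are simply skipped, which is a simplifying assumption. The .632 weighting shown is the standard one, which combines the apparent (training-set) estimate with the out-of-bag bootstrap estimate.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index: share of comparable pairs that the model ranks correctly.

    times       -- observed follow-up times
    events      -- 1 if relapse was observed, 0 if censored
    risk_scores -- model output, higher means earlier expected relapse
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i has an observed event
            # strictly before subject j's follow-up time (ties are skipped).
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


def bootstrap_632(apparent, out_of_bag):
    """Standard .632 weighting of the apparent and out-of-bag estimates."""
    return 0.368 * apparent + 0.632 * out_of_bag


# Invented toy data, for illustration only (not from GBSG2 or METABRIC).
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.1, 0.4, 0.7]

if __name__ == "__main__":
    print("C-index:", concordance_index(toy_times, toy_events, toy_risk))
    print(".632 estimate:", bootstrap_632(apparent=0.70, out_of_bag=0.60))
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the comparison reported in the abstract amounts to RSF and Cforest ranking patients' relapse risk more accurately than the Cox PH model on these datasets.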