Research Updates
Articles below are published ahead of final publication in an issue. Please cite articles in the following format: authors, (year), title, journal, DOI.

The shape of cancer relapse: Topological data analysis predicts recurrence in paediatric acute lymphoblastic leukaemia.

Published: 2023 Aug 14
Authors: Salvador Chulián, Bernadette J Stolz, Álvaro Martínez-Rubio, Cristina Blázquez Goñi, Juan F Rodríguez Gutiérrez, Teresa Caballero Velázquez, Águeda Molinos Quintana, Manuel Ramírez Orellana, Ana Castillo Robleda, José Luis Fuster Soler, Alfredo Minguela Puras, María V Martínez Sánchez, María Rosa, Víctor M Pérez-García, Helen M Byrne
Source: PLoS Computational Biology

Abstract:

Although children and adolescents with acute lymphoblastic leukaemia (ALL) have high survival rates, approximately 15-20% of patients relapse. Risk of relapse is routinely estimated at diagnosis by biological factors, including flow cytometry data. This high-dimensional data is typically manually assessed by projecting it onto a subset of biomarkers. Cell density and "empty spaces" in 2D projections of the data, i.e. regions devoid of cells, are then used for qualitative assessment. Here, we use topological data analysis (TDA), which quantifies shapes, including empty spaces, in data, to analyse pre-treatment ALL datasets with known patient outcomes. We combine these fully unsupervised analyses with Machine Learning (ML) to identify significant shape characteristics and demonstrate that they accurately predict risk of relapse, particularly for patients previously classified as 'low risk'. We independently confirm the predictive power of CD10, CD20, CD38, and CD45 as biomarkers for ALL diagnosis. Based on our analyses, we propose three increasingly detailed prognostic pipelines for analysing flow cytometry data from ALL patients depending on technical and technological availability: 1. Visual inspection of specific biological features in biparametric projections of the data; 2. Computation of quantitative topological descriptors of such projections; 3. A combined analysis, using TDA and ML, in the four-parameter space defined by CD10, CD20, CD38 and CD45. Our analyses readily extend to other haematological malignancies.

Copyright: © 2023 Chulián et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
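To give a feel for how an "empty space" in a 2D projection can be turned into a number, the sketch below computes 1-dimensional persistent homology of a Vietoris-Rips filtration, a common TDA descriptor for holes in point clouds. It is an illustration only: the abstract does not specify the authors' construction, so the choice of Vietoris-Rips, the function name `rips_h1_pairs`, its parameters, and the synthetic ring of points (standing in for cells surrounding an empty region) are all assumptions, and the brute-force approach only suits very small inputs.

```python
import math
from itertools import combinations


def rips_h1_pairs(points, max_edge, min_persistence=0.0):
    """Return finite (birth, death) pairs of 1-dimensional holes.

    Hypothetical helper: builds every vertex, edge and triangle of the
    Vietoris-Rips complex up to scale max_edge, then runs the standard
    boundary-matrix reduction over Z/2. Holes that never close within
    max_edge are not reported.
    """
    n = len(points)

    def dist(i, j):
        return math.dist(points[i], points[j])

    # Each simplex enters the filtration at the length of its longest edge.
    simplices = [(0.0, (i,)) for i in range(n)]
    for i, j in combinations(range(n), 2):
        d = dist(i, j)
        if d <= max_edge:
            simplices.append((d, (i, j)))
    for i, j, k in combinations(range(n), 3):
        d = max(dist(i, j), dist(i, k), dist(j, k))
        if d <= max_edge:
            simplices.append((d, (i, j, k)))

    # Sort by scale, then by dimension, so faces precede their cofaces.
    simplices.sort(key=lambda s: (s[0], len(s[1]), s[1]))
    index = {s: idx for idx, (_, s) in enumerate(simplices)}

    # Boundary of each simplex as a set of face indices (Z/2 coefficients).
    columns = []
    for _, s in simplices:
        if len(s) == 1:
            columns.append(set())
        else:
            columns.append({index[f] for f in combinations(s, len(s) - 1)})

    low_to_col = {}  # lowest (largest-index) entry -> column that owns it
    pairs = []
    for j, col in enumerate(columns):
        # Add earlier columns until the lowest entry is unique or col is empty.
        while col and max(col) in low_to_col:
            col ^= columns[low_to_col[max(col)]]
        if col:
            low = max(col)
            low_to_col[low] = j
            if len(simplices[low][1]) == 2:  # an edge paired with a triangle
                birth, death = simplices[low][0], simplices[j][0]
                if death - birth > min_persistence:
                    pairs.append((birth, death))
    return pairs


# Synthetic example: eight points on a unit circle enclose one empty region.
ring = [(math.cos(2 * math.pi * k / 8), math.sin(2 * math.pi * k / 8))
        for k in range(8)]
holes = rips_h1_pairs(ring, max_edge=2.1, min_persistence=0.5)
print(holes)
```

For the ring, the single long-lived pair is born when neighbouring points first connect and dies when triangles finally fill the interior; the gap between the two values (the persistence) is the kind of quantitative shape descriptor that could then be passed to a classifier. For real flow cytometry data, an optimised library would be needed in place of this brute-force enumeration.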