Research Updates
Articles below are published ahead of final publication in an issue. Please cite articles in the following format: authors, (year), title, journal, DOI.

Identifying cancer cells from calling single-nucleotide variants in scRNA-seq data.

Published: 2024 Aug 20
Authors: Valérie Marot-Lassauzaie, Sergi Beneyto-Calabuig, Benedikt Obermayer, Lars Velten, Dieter Beule, Laleh Haghverdi
Source: Bioinformatics

Abstract:

Single cell RNA sequencing (scRNA-seq) data is widely used to study cancer cell states and their heterogeneity. However, the tumour microenvironment is usually a mixture of healthy and cancerous cells and it can be difficult to fully separate these two populations based on transcriptomics alone. If available, somatic single nucleotide variants (SNVs) observed in the scRNA-seq data could be used to identify the cancer population and match that information with the single cells' expression profile. However, calling somatic SNVs in scRNA-seq data is a challenging task, as most variants seen in the short read data are not somatic, but can instead be germline variants, RNA edits or transcription, sequencing or processing errors. Additionally, only variants present in actively transcribed regions for each individual cell will be seen in the data. To address these challenges, we develop CCLONE (Cancer Cell Labelling On Noisy Expression), an interpretable tool adapted to handle the uncertainty and sparsity of SNVs called from scRNA-seq data. CCLONE jointly identifies cancer clonal populations and their associated variants. We apply CCLONE to two acute myeloid leukaemia datasets and one lung adenocarcinoma dataset and show that CCLONE captures both genetic clones and somatic events for multiple patients. These results show how CCLONE can be used to gather insight into the course of the disease and the origin of cancer cells in scRNA-seq data.

Source code is available at github.com/HaghverdiLab/CCLONE. Supplementary data are available at Bioinformatics online.

© The Author(s) 2024. Published by Oxford University Press.