Research Updates

Statistical analysis of multiple regions-of-interest in multiplexed spatial proteomics data.

Published: 2024 Sep 23
Authors: Sarah Samorodnitsky, Michael C Wu
Source: Briefings in Bioinformatics

Abstract:

Multiplexed spatial proteomics reveals the spatial organization of cells in tumors, which is associated with important clinical outcomes such as survival and treatment response. This spatial organization is often summarized using spatial summary statistics, including Ripley's K and Besag's L. However, if multiple regions of the same tumor are imaged, it is unclear how to synthesize the relationship with a single patient-level endpoint. We evaluate extant approaches for accommodating multiple images within the context of associating summary statistics with outcomes. First, we consider averaging-based approaches wherein multiple summaries for a single sample are combined in a weighted mean. We then propose a novel class of ensemble testing approaches in which we simulate random weights used to aggregate summaries, test for an association with outcomes, and combine the $P$-values. We systematically evaluate the performance of these approaches via simulation and application to data from non-small cell lung cancer, colorectal cancer, and triple negative breast cancer. We find that the optimal strategy varies, but a simple weighted average of the summary statistics based on the number of cells in each image often offers the highest power and controls type I error effectively. When the size of the imaged regions varies, incorporating this variation into the weighted aggregation may yield additional power in cases where the varying size is informative. Ensemble testing (but not resampling) offered high power and type I error control across conditions in our simulated data sets. © The Author(s) 2024. Published by Oxford University Press.
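
The two aggregation strategies described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not state the association test, the distribution of the random weights, or the rule for combining $P$-values, so the choices below are assumptions made only for illustration: a Pearson correlation test with a Fisher z approximation, normalized exponential weights, and a Cauchy combination of the $P$-values. All function names and the toy data are hypothetical.

```python
import math
import random


def weighted_mean(values, weights):
    """Collapse one patient's per-image summaries into a single weighted mean."""
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total


def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def association_pvalue(x, y):
    """Approximate two-sided P-value for the correlation (Fisher z transform).

    Assumption: stands in for whatever outcome model a real analysis would use.
    """
    r = max(min(pearson_r(x, y), 0.999999), -0.999999)
    z = math.atanh(r) * math.sqrt(len(x) - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def cauchy_combine(pvalues):
    """Equal-weight Cauchy combination (assumed combination rule)."""
    clipped = [min(max(p, 1e-15), 1 - 1e-15) for p in pvalues]
    t = sum(math.tan((0.5 - p) * math.pi) for p in clipped) / len(clipped)
    return 0.5 - math.atan(t) / math.pi


def ensemble_test(per_patient_summaries, outcome, n_draws=200, seed=0):
    """Draw random image weights, test each aggregate, combine the P-values."""
    rng = random.Random(seed)
    pvals = []
    for _ in range(n_draws):
        x = [
            # Assumed weight distribution: independent exponential draws.
            weighted_mean(s, [rng.expovariate(1.0) for _ in s])
            for s in per_patient_summaries
        ]
        pvals.append(association_pvalue(x, outcome))
    return cauchy_combine(pvals)


# Toy data (fabricated for illustration only): 40 patients, 2-5 images each.
rng = random.Random(1)
outcome = [rng.gauss(0, 1) for _ in range(40)]
per_patient_summaries, per_patient_cells = [], []
for y in outcome:
    k = rng.randint(2, 5)
    per_patient_summaries.append([y + rng.gauss(0, 1) for _ in range(k)])
    per_patient_cells.append([rng.randint(200, 2000) for _ in range(k)])

# Strategy 1: weight each image's summary by its cell count.
x_cells = [
    weighted_mean(s, c) for s, c in zip(per_patient_summaries, per_patient_cells)
]
p_weighted = association_pvalue(x_cells, outcome)

# Strategy 2: ensemble test over random weights.
p_ensemble = ensemble_test(per_patient_summaries, outcome)
print(p_weighted, p_ensemble)
```

In a real analysis, each per-image summary would be a spatial statistic such as Ripley's K evaluated at a chosen radius, and the outcome model would match the endpoint, for example a Cox model for survival.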