Considerations for pooling real-world data as a comparator cohort to a single arm trial: a simulation study on assessment of heterogeneity.

Published: 2023 Aug 24

Authors: Daniel Backenroth, Trevor Royce, Jose Pinheiro, Meghna Samant, Olivier Humblet

Source: BMC Medical Research Methodology

Abstract:

Novel precision medicine therapeutics target increasingly granular, genomically defined populations. Rare subgroups are challenging to study within a clinical trial or a single real-world data (RWD) source; therefore, pooling from disparate sources of RWD may be required for feasibility. Heterogeneity assessment for pooled data is particularly complex when contrasting a pooled real-world comparator cohort (rwCC) with a single-arm clinical trial (SAT), because the individual comparisons are not independent: each compares a rwCC to the same SAT. Our objective was to develop a methodological framework for pooling RWD focused on the rwCC use case, and to simulate novel approaches to heterogeneity assessment, especially for small datasets.

We present a framework with the following steps: pre-specification, assessment of dataset eligibility, and outcome analyses (including assessment of outcome heterogeneity). We then simulated heterogeneity assessments for a binary response outcome in a SAT compared to two rwCCs, using standard methods for meta-analysis, an adjusted Cochran's Q test, and a direct comparison of the individual participant data (IPD) from the rwCCs.

We found identical power to detect a true difference for the adjusted Cochran's Q test and the IPD method, with both approaches superior to a standard Cochran's Q test. When assessing the impact of heterogeneity in the null scenario of no difference between the SAT and rwCCs, a lack of statistical power led to Type 1 error inflation. Similarly, in the alternative scenario of a true difference between SAT and rwCCs, we found substantial Type 2 error, with underpowered heterogeneity testing leading to underestimation of the treatment effect.

We developed a methodological framework for pooling RWD sources in the context of designing a rwCC for a SAT. When testing for heterogeneity during this process, the adjusted Cochran's Q test matches the statistical power of IPD heterogeneity testing. Limitations of quantitative heterogeneity testing in protecting against Type 1 or Type 2 error indicate these tests are best used descriptively, and after careful selection of datasets based on clinical/data considerations. We hope these findings will facilitate the rigorous pooling of RWD to unlock insights to benefit oncology patients.

© 2023. BioMed Central Ltd., part of Springer Nature.
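
As a rough illustration of why a standard Cochran's Q test can lose power when every comparison shares the same SAT, the Python sketch below computes three heterogeneity statistics for one SAT and two rwCCs with a binary response outcome. It rests on several assumptions that are not taken from the abstract: the effect measure is the risk difference with Wald variances, the function names are hypothetical, the counts are made-up illustrative values, and the "covariance-aware" statistic is a generic correction for the shared SAT arm that may differ from the adjusted Cochran's Q test the authors simulated.

```python
import math


def chi2_sf_df1(q):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(q / 2.0))


def rate_and_var(responders, n):
    """Response rate and its Wald variance (assumes 0 < responders < n)."""
    p = responders / n
    return p, p * (1.0 - p) / n


def heterogeneity_two_rwcc(sat, rwcc1, rwcc2):
    """Hypothetical helper. Each argument is a (responders, n) tuple."""
    p_sat, v_sat = rate_and_var(*sat)
    p1, v1 = rate_and_var(*rwcc1)
    p2, v2 = rate_and_var(*rwcc2)

    # Risk differences of the SAT versus each rwCC.
    d1, d2 = p_sat - p1, p_sat - p2

    # Standard Cochran's Q: treats d1 and d2 as independent estimates.
    w1, w2 = 1.0 / (v_sat + v1), 1.0 / (v_sat + v2)
    pooled = (w1 * d1 + w2 * d2) / (w1 + w2)
    q_standard = w1 * (d1 - pooled) ** 2 + w2 * (d2 - pooled) ** 2

    # Covariance-aware version (assumption): Cov(d1, d2) = v_sat because both
    # comparisons reuse the same SAT, so Var(d1 - d2) = v1 + v2.
    q_adjusted = (d1 - d2) ** 2 / (v1 + v2)

    # Direct comparison of the two rwCCs (unpooled two-proportion z-test).
    z = (p1 - p2) / math.sqrt(v1 + v2)
    q_ipd = z ** 2

    return {
        "q_standard": q_standard, "p_standard": chi2_sf_df1(q_standard),
        "q_adjusted": q_adjusted, "p_adjusted": chi2_sf_df1(q_adjusted),
        "q_ipd": q_ipd, "p_ipd": chi2_sf_df1(q_ipd),
    }


# Illustrative counts only: SAT 12/40, rwCC1 10/50, rwCC2 20/50 responders.
result = heterogeneity_two_rwcc((12, 40), (10, 50), (20, 50))
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions the SAT response rate cancels out of d1 - d2, so the covariance-aware statistic and the direct rwCC comparison coincide, while the standard Q counts the SAT variance twice in its denominator and is therefore smaller. This is consistent with, though not a reproduction of, the reported finding that the adjusted test and the IPD method have identical power and both outperform the standard test.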