Smoothing Lexis diagrams using kernel functions: A contemporary approach.
Published: 2023 Aug 24
Authors:
Philip S Rosenberg, Adalberto Miranda Filho, Julia Elrod, Aryana Arsham, Ana F Best, Pavel Chernyavskiy
Source:
STATISTICAL METHODS IN MEDICAL RESEARCH
Abstract:
Lexis diagrams are rectangular arrays of event rates indexed by age and period. Analysis of Lexis diagrams is a cornerstone of cancer surveillance research. Typically, population-based descriptive studies analyze multiple Lexis diagrams defined by sex, tumor characteristics, race/ethnicity, geographic region, etc. Inevitably, the amount of information per Lexis diagram diminishes with increasing stratification. Several methods have been proposed to smooth observed Lexis diagrams up front to clarify salient patterns and improve summary estimates of averages, gradients, and trends. In this article, we develop a novel bivariate kernel-based smoother that incorporates two key innovations. First, for any given kernel, we calculate its singular value decomposition and select an optimal truncation point (the number of leading singular vectors to retain) based on the bias-corrected Akaike information criterion. Second, we model-average over a panel of candidate kernels with diverse shapes and bandwidths. The truncated model averaging approach is fast and automatic, has excellent performance, and provides a variance-covariance matrix that takes model selection into account. We present an in-depth case study (invasive estrogen receptor-negative breast cancer incidence among non-Hispanic white women in the United States) and simulate operating characteristics for 20 representative cancers. The truncated model averaging approach consistently outperforms any fixed kernel. Our results support the routine use of the truncated model averaging approach in descriptive studies of cancer.
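
The two steps named in the abstract (truncating each kernel's singular value decomposition by a corrected AIC, then averaging over candidate kernels) can be illustrated with a minimal sketch. This is not the authors' implementation; the abstract does not give the details, so several choices here are assumptions made only for illustration: a row-normalized Gaussian product kernel on the age-by-period grid, Gaussian errors on the smoothed scale (for example, log rates), the corrected AIC for linear smoothers of Hurvich, Simonoff, and Tsai, and Akaike weights for the averaging. The function names `gaussian_smoother`, `aicc`, `best_truncation`, and `truncated_model_average` are hypothetical, and the sketch varies only bandwidths, not kernel shapes, and does not compute the variance-covariance matrix.

```python
import numpy as np

def gaussian_smoother(n_age, n_per, h_age, h_per):
    """Row-normalized Gaussian product-kernel smoother matrix (assumed kernel)
    acting on the row-major vectorized age x period grid."""
    a, p = np.meshgrid(np.arange(n_age), np.arange(n_per), indexing="ij")
    a, p = a.ravel(), p.ravel()
    da = (a[:, None] - a[None, :]) / h_age
    dp = (p[:, None] - p[None, :]) / h_per
    K = np.exp(-0.5 * (da ** 2 + dp ** 2))
    return K / K.sum(axis=1, keepdims=True)

def aicc(y, S):
    """Corrected AIC for a linear smoother y_hat = S @ y (assumed Gaussian errors):
    log(sigma2) + 1 + 2 (tr(S) + 1) / (n - tr(S) - 2)."""
    n = y.size
    df = np.trace(S)
    if n - df - 2.0 <= 0.0:
        return np.inf  # too many effective parameters for the criterion
    sigma2 = np.mean((y - S @ y) ** 2)
    return np.log(sigma2) + 1.0 + 2.0 * (df + 1.0) / (n - df - 2.0)

def best_truncation(y, S, k_max):
    """Keep the k leading singular triplets of S that minimize the corrected AIC."""
    U, d, Vt = np.linalg.svd(S)
    best = None
    for k in range(1, k_max + 1):
        S_k = (U[:, :k] * d[:k]) @ Vt[:k, :]  # rank-k truncation of S
        score = aicc(y, S_k)
        if best is None or score < best[0]:
            best = (score, k, S_k)
    return best

def truncated_model_average(Y, bandwidths, k_max=None):
    """Average the truncated smoothers over candidate (h_age, h_per) pairs
    using Akaike weights (assumed weighting scheme)."""
    y = Y.ravel()
    n = y.size
    k_max = k_max if k_max is not None else n // 2
    scores, ranks, fits = [], [], []
    for h_age, h_per in bandwidths:
        S = gaussian_smoother(Y.shape[0], Y.shape[1], h_age, h_per)
        score, k, S_k = best_truncation(y, S, k_max)
        scores.append(score)
        ranks.append(k)
        fits.append(S_k @ y)
    scores = np.array(scores)
    # n * AICc is on the -2 log-likelihood scale, so weights use exp(-n * delta / 2).
    w = np.exp(-0.5 * n * (scores - scores.min()))
    w = w / w.sum()
    fitted = sum(wi * fi for wi, fi in zip(w, fits)).reshape(Y.shape)
    return fitted, w, ranks
```

A call such as `truncated_model_average(Y, [(1.0, 1.0), (2.0, 1.0), (2.0, 2.0)])`, with `Y` an age-by-period matrix of log rates, returns the averaged surface, the weight given to each candidate kernel, and the truncation rank chosen for each.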