Research Updates

Leveraging error-prone algorithm-derived phenotypes: Enhancing association studies for risk factors in EHR data.

Published: 2024 Jul 12
Authors: Yiwen Lu, Jiayi Tong, Jessica Chubak, Thomas Lumley, Rebecca A Hubbard, Hua Xu, Yong Chen
Source: Journal of Biomedical Informatics

Abstract:

It has become increasingly common for multiple computable phenotypes from electronic health records (EHR) to be developed for a given phenotype. However, EHR-based association studies often focus on a single phenotype. In this paper, we develop a method aiming to simultaneously make use of multiple EHR-derived phenotypes for reduction of bias due to phenotyping error and improved efficiency of phenotype/exposure associations. The proposed method combines multiple algorithm-derived phenotypes with a small set of validated outcomes to reduce bias and improve estimation accuracy and efficiency. The performance of our method was evaluated through simulation studies and real-world application to an analysis of colon cancer recurrence using EHR data from Kaiser Permanente Washington. In settings where there was no single surrogate performing uniformly better than all others in terms of both sensitivity and specificity, our method achieved substantial bias reduction compared to using a single algorithm-derived phenotype. Our method also led to higher estimation efficiency by up to 30% compared to an estimator that used only one algorithm-derived phenotype. Simulation studies and application to real-world data demonstrated the effectiveness of our method in integrating multiple phenotypes, thereby enhancing bias reduction, statistical accuracy and efficiency. Our method combines information across multiple surrogates using a statistically efficient seemingly unrelated regression framework. Our method provides a robust alternative to single-surrogate-based bias correction, especially in contexts lacking information on which surrogate is superior. Copyright © 2024. Published by Elsevier Inc.
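
To make the idea of combining several error-prone phenotypes with a small validated subset more concrete, here is a minimal Python sketch of one generic augmented estimator. It is not the authors' implementation, and the paper's seemingly unrelated regression formulation may differ in its details. The sketch fits a logistic model to the validated outcomes, fits the same model to each surrogate on both the validation subset and the full cohort, and uses the stacked differences between those surrogate fits to adjust the validation-only estimate. All function names, the simulated data, and the sensitivity/specificity values are assumptions made for illustration.

```python
import numpy as np


def fit_logistic(X, y, n_iter=25):
    """Logistic regression by Newton-Raphson.

    Returns the coefficients and per-row influence functions, so that
    beta_hat - beta is approximately the mean of the returned rows.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        hessian = (X * (p * (1.0 - p))[:, None]).T @ X
        beta = beta + np.linalg.solve(hessian, X.T @ (y - p))
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    info = (X * (p * (1.0 - p))[:, None]).T @ X / len(y)
    influence = (X * (y - p)[:, None]) @ np.linalg.inv(info)
    return beta, influence


def augmented_estimate(X, surrogates, val_idx, y_val):
    """Adjust a validation-only fit using surrogate fits (illustrative only).

    X          : (N, p) design matrix including an intercept column
    surrogates : (N, K) matrix of 0/1 algorithm-derived phenotypes
    val_idx    : indices of a randomly sampled validation subset
    y_val      : validated 0/1 outcomes for the rows in val_idx
    """
    N, n = X.shape[0], len(val_idx)
    X_val = X[val_idx]
    beta_val, phi = fit_logistic(X_val, y_val)

    diffs, psis = [], []
    for k in range(surrogates.shape[1]):
        gamma_full, _ = fit_logistic(X, surrogates[:, k])
        gamma_val, psi_k = fit_logistic(X_val, surrogates[val_idx, k])
        diffs.append(gamma_val - gamma_full)
        psis.append(psi_k)
    diff = np.concatenate(diffs)  # surrogate contrasts, stacked over K
    psi = np.hstack(psis)         # matching stacked influence functions

    cov_phi_psi = phi.T @ psi / n
    cov_psi = psi.T @ psi / n
    gain = cov_phi_psi @ np.linalg.inv(cov_psi)

    beta_aug = beta_val - gain @ diff
    var_val = phi.T @ phi / n / n  # sandwich variance, validation-only fit
    var_aug = var_val - (1.0 / n - 1.0 / N) * gain @ cov_phi_psi.T
    return beta_aug, var_aug, beta_val, var_val


def simulate(N=20000, n=1000, seed=0):
    """Simulated cohort; true coefficients are (-1.0, 0.5) by assumption."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=N)
    X = np.column_stack([np.ones(N), x])
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(-1.0 + 0.5 * x))))
    # Hypothetical (sensitivity, specificity) pairs; neither surrogate
    # is better than the other on both measures.
    operating_points = [(0.90, 0.70), (0.70, 0.95)]
    surrogates = np.column_stack([
        np.where(y == 1, rng.binomial(1, se, N), rng.binomial(1, 1.0 - sp, N))
        for se, sp in operating_points
    ])
    val_idx = rng.choice(N, size=n, replace=False)
    return X, surrogates, val_idx, y[val_idx]


X, S, val_idx, y_val = simulate()
beta_aug, var_aug, beta_val, var_val = augmented_estimate(X, S, val_idx, y_val)
print("validation-only:", beta_val, np.sqrt(np.diag(var_val)))
print("augmented:      ", beta_aug, np.sqrt(np.diag(var_aug)))
```

In this sketch the surrogates never enter the outcome model directly. They only supply a correction term whose weight depends on how strongly the surrogate fits co-vary with the validated fit, so a poorly performing surrogate contributes little rather than introducing bias. The variance formula assumes the validation subset is a simple random sample of the cohort.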