Research Updates
Articles below are published ahead of final publication in an issue. Please cite articles in the following format: authors, (year), title, journal, DOI.

Fairness in Predicting Cancer Mortality Across Racial Subgroups.

Published: 2024 Jul 01
Authors: Teja Ganta, Arash Kia, Prathamesh Parchure, Min-Heng Wang, Melanie Besculides, Madhu Mazumdar, Cardinale B Smith
Source: JAMA Network Open

Abstract:

Importance: Machine learning has the potential to transform cancer care by helping clinicians prioritize patients for serious illness conversations. However, models need to be evaluated for unequal performance across racial groups (ie, racial bias) so that existing racial disparities are not exacerbated.

Objective: To evaluate whether racial bias exists in a predictive machine learning model that identifies 180-day cancer mortality risk among patients with solid malignant tumors.

Design, Setting, and Participants: In this cohort study, a machine learning model to predict cancer mortality for patients aged 21 years or older diagnosed with cancer between January 2016 and December 2021 was developed with a random forest algorithm using retrospective data from the Mount Sinai Health System cancer registry, Social Security Death Index, and electronic health records up to the date when databases were accessed for cohort extraction (February 2022).

Exposure: Race category.

Main Outcomes and Measures: The primary outcomes were model discriminatory performance (area under the receiver operating characteristic curve [AUROC], F1 score) among each race category (Asian, Black, Native American, White, and other or unknown) and fairness metrics (equal opportunity, equalized odds, and disparate impact) among each pairwise comparison of race categories. True-positive rate ratios represented equal opportunity; both true-positive and false-positive rate ratios, equalized odds; and the percentage of predictive positive rate ratios, disparate impact. All metrics were estimated as a proportion or ratio, with variability captured through 95% CIs. The prespecified criterion for the model's clinical use was a threshold of at least 80% for fairness metrics across different racial groups to ensure the model's prediction would not be biased against any specific race.

Results: The test validation dataset included 43 274 patients with balanced demographics. Mean (SD) age was 64.09 (14.26) years, with 49.6% older than 65 years. A total of 53.3% were female; 9.5%, Asian; 18.9%, Black; 0.1%, Native American; 52.2%, White; and 19.2%, other or unknown race; 0.1% had missing race data. A total of 88.9% of patients were alive, and 11.1% had died. The AUROCs, F1 scores, and fairness metrics maintained reasonable concordance among the racial subgroups: the AUROCs ranged from 0.75 (95% CI, 0.72-0.78) for Asian patients and 0.75 (95% CI, 0.73-0.77) for Black patients to 0.77 (95% CI, 0.75-0.79) for patients with other or unknown race; F1 scores, from 0.32 (95% CI, 0.32-0.33) for White patients to 0.40 (95% CI, 0.39-0.42) for Black patients; equal opportunity ratios, from 0.96 (95% CI, 0.95-0.98) for Black patients compared with White patients to 1.02 (95% CI, 1.00-1.04) for Black patients compared with patients with other or unknown race; equalized odds ratios, from 0.87 (95% CI, 0.85-0.92) for Black patients compared with White patients to 1.16 (95% CI, 1.10-1.21) for Black patients compared with patients with other or unknown race; and disparate impact ratios, from 0.86 (95% CI, 0.82-0.89) for Black patients compared with White patients to 1.17 (95% CI, 1.12-1.22) for Black patients compared with patients with other or unknown race.

Conclusions and Relevance: In this cohort study, the lack of significant variation in performance or fairness metrics indicated an absence of racial bias, suggesting that the model fairly identified cancer mortality risk across racial groups. It remains essential to consistently review the model's application in clinical settings to ensure equitable patient care.
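
The fairness ratios defined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (`group_rates`, `fairness_ratios`, `meets_threshold`) and the toy labels are hypothetical. The abstract reports equalized odds as a single ratio without saying how the true-positive and false-positive rate ratios were combined, so the sketch returns both components separately. Applying the 80% criterion symmetrically (so that a ratio of 1.17 is judged the same way as its reciprocal) is also an assumption.

```python
def group_rates(y_true, y_pred):
    """TPR, FPR, and predicted-positive rate for one group (labels are 0/1).

    Assumes the group contains at least one positive and one negative case.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "tpr": tp / (tp + fn),          # true-positive rate
        "fpr": fp / (fp + tn),          # false-positive rate
        "ppr": (tp + fp) / len(pairs),  # share predicted positive
    }


def fairness_ratios(rates_a, rates_b):
    """Pairwise ratios (group A over group B) for the three fairness metrics."""
    tpr_ratio = rates_a["tpr"] / rates_b["tpr"]
    return {
        "equal_opportunity": tpr_ratio,
        # Equalized odds involves both ratios; how the study merged them
        # into one number is not stated, so both are reported here.
        "equalized_odds_tpr": tpr_ratio,
        "equalized_odds_fpr": rates_a["fpr"] / rates_b["fpr"],
        "disparate_impact": rates_a["ppr"] / rates_b["ppr"],
    }


def meets_threshold(ratio, threshold=0.8):
    """Symmetric 80% check (assumption): the smaller of ratio and 1/ratio."""
    return ratio > 0 and min(ratio, 1 / ratio) >= threshold


# Toy data only; these are not study data.
group_a = group_rates([1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
                      [1, 1, 1, 0, 1, 0, 0, 0, 0, 0])
group_b = group_rates([1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
                      [1, 1, 1, 1, 1, 1, 0, 0, 0, 0])
ratios = fairness_ratios(group_a, group_b)
for name, value in ratios.items():
    print(f"{name}: {value:.2f}, meets 80% criterion: {meets_threshold(value)}")
```

Under this symmetric reading, the extreme ratios reported in the abstract (0.86 and 1.17 for disparate impact) both satisfy the 80% criterion.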