Research Updates
Use of Virus Genotypes in Machine Learning Diagnostic Prediction Models for Cervical Cancer in Women With High-Risk Human Papillomavirus Infection

Published: 2023 Aug 01
Authors: Ting Xiao, Chunhua Wang, Mei Yang, Jun Yang, Xiaohan Xu, Liang Shen, Zhou Yang, Hui Xing, Chun-Quan Ou
Source: Disease Models & Mechanisms

Abstract:

Importance: High-risk human papillomavirus (hrHPV) is recognized as an etiologic agent for cervical cancer, and recent World Health Organization guidelines recommend hrHPV DNA testing as the preferred method of cervical cancer screening. Cervical cancer prediction models may be useful for screening and monitoring, particularly in low-resource settings where cytological and colposcopic examination results are unavailable, but previous studies did not include women infected with hrHPV.

Objective: To develop and validate a cervical cancer prediction model that includes women positive for hrHPV infection and to examine whether including HPV genotypes improves the ability to predict cervical cancer.

Design, Setting, and Participants: This diagnostic study included diagnostic data from 314 587 women collected from 136 primary care centers in China between January 15, 2017, and February 28, 2018. The data set was separated geographically into data from 100 primary care centers in 6 districts for model development (training data set) and 36 centers in 3 districts for model validation. A total of 24 391 women identified with positive hrHPV test results in the cervical cancer screening program were included in the study. Data were analyzed from January 1, 2022, to July 14, 2022.

Main Outcomes and Measures: Cervical intraepithelial neoplasia grade 3 or worse (CIN3+) was the primary outcome, and cervical intraepithelial neoplasia grade 2 or worse (CIN2+) was the secondary outcome. The ability of the prediction models to discriminate CIN3+ and CIN2+ was evaluated using the area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, positive likelihood ratio, and negative likelihood ratio. The calibration and clinical utility of the models were assessed using calibration plots and decision curves, respectively.

Results: After excluding women without screening outcomes, the study included 21 720 women (median [IQR] age, 50 [44-55] years). Of 14 553 women in the training data set, 349 (2.4%) received a diagnosis of CIN3+ and 673 (4.6%) of CIN2+. Of 7167 women in the validation set, 167 (2.3%) received a diagnosis of CIN3+ and 228 (3.2%) of CIN2+. Including HPV genotype in the model improved the AUROC by 35.9% for CIN3+ and 41.7% for CIN2+. With HPV genotype, epidemiological factors, and pelvic examination as predictors, the stacking model had an AUROC of 0.87 (95% CI, 0.84-0.90) for predicting CIN3+, with a sensitivity of 80.1%, specificity of 83.4%, positive likelihood ratio of 4.83, and negative likelihood ratio of 0.24. The model for predicting CIN2+ had an AUROC of 0.85 (95% CI, 0.82-0.88), with a sensitivity of 80.4%, specificity of 81.0%, positive likelihood ratio of 4.23, and negative likelihood ratio of 0.24. The decision curve analysis indicated that the stacking model provided a superior standardized net benefit when the threshold probability for clinical decision was lower than 23% for CIN3+ and lower than 17% for CIN2+.

Conclusions and Relevance: This diagnostic study found that inclusion of HPV genotypes markedly improved the ability of a stacking model to predict cervical cancer among women who tested positive for hrHPV infection. This prediction model may be an important tool for screening and monitoring cervical cancer, particularly in low-resource settings.
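As a reading aid, the likelihood ratios quoted in the abstract can be recomputed from the reported sensitivity and specificity using the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The short Python sketch below is not the authors' code; the function name `likelihood_ratios` and the `reported` dictionary are illustrative assumptions, and only the sensitivity and specificity values are taken from the abstract.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (LR+, LR-) for a binary test.

    Hypothetical helper for illustration; both inputs are proportions in (0, 1).
    """
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


# Sensitivity and specificity as reported in the abstract for the stacking model.
reported = {
    "CIN3+": (0.801, 0.834),
    "CIN2+": (0.804, 0.810),
}

for outcome, (sens, spec) in reported.items():
    lr_pos, lr_neg = likelihood_ratios(sens, spec)
    print(f"{outcome}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
# CIN3+: LR+ = 4.83, LR- = 0.24
# CIN2+: LR+ = 4.23, LR- = 0.24
```

The recomputed values agree with the likelihood ratios stated in the abstract, which suggests the four figures for each outcome refer to the same operating point of the model.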