Software using artificial intelligence for nodule and cancer detection in CT lung cancer screening: systematic review of test accuracy studies.
Published: 2024 Oct 16
Authors:
Julia Geppert, Asra Asgharzadeh, Anna Brown, Chris Stinton, Emma J Helm, Surangi Jayakody, Daniel Todkill, Daniel Gallacher, Hesam Ghiasvand, Mubarak Patel, Peter Auguste, Alexander Tsertsvadze, Yen-Fu Chen, Amy Grove, Bethany Shinkins, Aileen Clarke, Sian Taylor-Phillips
Source:
THORAX
Abstract:
To examine the accuracy and impact of artificial intelligence (AI) software assistance in lung cancer screening using CT.

A systematic review of CE-marked, AI-based software for automated detection and analysis of nodules in CT lung cancer screening was conducted. Multiple databases including Medline, Embase and Cochrane CENTRAL were searched from 2012 to March 2023. Primary research reporting test accuracy or impact on reading time or clinical management was included. QUADAS-2 and QUADAS-C were used to assess risk of bias. We undertook narrative synthesis.

Eleven studies evaluating six different AI-based software products and reporting on 19 770 patients were eligible. All were at high risk of bias with multiple applicability concerns. Compared with unaided reading, AI-assisted reading was faster and generally improved sensitivity (+5% to +20% for detecting/categorising actionable nodules; +3% to +15% for detecting/categorising malignant nodules), with lower specificity (-7% to -3% for correctly detecting/categorising people without actionable nodules; -8% to -6% for correctly detecting/categorising people without malignant nodules). AI assistance tended to increase the proportion of nodules allocated to higher risk categories. Assuming 0.5% cancer prevalence, these results would translate into an additional 150-750 cancers detected per million people attending screening but lead to an additional 59 700 to 79 600 people attending screening without cancer receiving unnecessary CT surveillance.

AI assistance in lung cancer screening may improve sensitivity but increases the number of false-positive results and unnecessary surveillance. Future research needs to increase the specificity of AI-assisted reading and minimise risk of bias and applicability concerns through improved study design.

PROSPERO registration number: CRD42021298449.

© Author(s) (or their employer(s)) 2024. Re-use permitted under CC BY. Published by BMJ.
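
The per-million figures quoted in the abstract can be reproduced with simple arithmetic. The sketch below is illustrative only and is not taken from the paper. It assumes that the reported changes for malignant nodules (+3% to +15% sensitivity, -8% to -6% specificity) are absolute percentage-point differences applied to a screened population of one million at 0.5% cancer prevalence. The function and variable names are hypothetical.

```python
# Illustrative arithmetic only; assumes the abstract's sensitivity and
# specificity changes are absolute percentage-point differences.
POPULATION = 1_000_000
PREVALENCE = 0.005  # 0.5% cancer prevalence, as assumed in the abstract

with_cancer = round(POPULATION * PREVALENCE)   # people with cancer
without_cancer = POPULATION - with_cancer      # people without cancer


def extra_cases(group_size, change_in_points):
    """People affected by a change of `change_in_points` percentage points."""
    return round(group_size * change_in_points / 100)


# Sensitivity gain of 3 to 15 points applies to people with cancer.
extra_cancers = (extra_cases(with_cancer, 3), extra_cases(with_cancer, 15))

# Specificity loss of 6 to 8 points applies to people without cancer.
extra_false_positives = (
    extra_cases(without_cancer, 6),
    extra_cases(without_cancer, 8),
)

print(extra_cancers)          # (150, 750)
print(extra_false_positives)  # (59700, 79600)
```

Under these assumptions the results match the abstract's ranges of 150-750 additional cancers detected and 59 700 to 79 600 additional people without cancer sent for surveillance, which suggests the authors applied the malignant-nodule estimates in this way.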