Research Updates
Articles below are published ahead of final publication in an issue. Please cite articles in the following format: authors, (year), title, journal, DOI.

The clinical value of artificial intelligence in assisting junior radiologists in thyroid ultrasound: a multicenter prospective study from real clinical practice.

Published: 2024 Jul 12
Authors: Dong Xu, Lin Sui, Chunquan Zhang, Jing Xiong, Vicky Yang Wang, Yahan Zhou, Xinying Zhu, Chen Chen, Yu Zhao, Yiting Xie, Weizhen Kong, Jincao Yao, Lei Xu, Yuxia Zhai, Liping Wang
Source: BMC Medicine

Abstract:

This study aims to propose clinically applicable 2-echelon (2e) diagnostic criteria for the analysis of thyroid nodules, under which low-risk nodules are screened out and only suspicious or indeterminate ones are further examined by histopathology, and to explore whether artificial intelligence (AI) can provide precise assistance for clinical decision-making in a real-world prospective scenario.

In this prospective study, we enrolled 1036 patients with a total of 2296 thyroid nodules from three medical centers. The diagnostic performance of the AI system, of radiologists with different levels of experience, and of AI-assisted radiologists with different levels of experience was evaluated against our proposed 2e diagnostic criteria, in which the first echelon is an arbitration committee consisting of 3 senior specialists and the second is cyto- or histopathology.

According to the 2e diagnostic criteria, 1543 nodules were classified by the arbitration committee, and the benign or malignant nature of 753 nodules was determined by pathological examination. Taking pathological results as the evaluation standard, the sensitivity, specificity, accuracy, and area under the receiver operating characteristic curve (AUC) of the AI system were 0.826, 0.815, 0.821, and 0.821, respectively. For those cases where the diagnosis by the arbitration committee was taken as the evaluation standard, the sensitivity, specificity, accuracy, and AUC of the AI system were 0.946, 0.966, 0.964, and 0.956, respectively. Taking the global 2e diagnostic criteria as the gold standard, the sensitivity, specificity, accuracy, and AUC of the AI system were 0.868, 0.934, 0.917, and 0.901, respectively. Under the different criteria, the diagnostic performance of the AI system was comparable to that of senior radiologists and exceeded that of junior radiologists (all P < 0.05). Furthermore, AI assistance significantly improved the performance of junior radiologists in the diagnosis of thyroid nodules, and their diagnostic performance was comparable to that of senior radiologists when pathological results were taken as the gold standard (all P > 0.05).

The proposed 2e diagnostic criteria are consistent with real-world clinical evaluations and affirm the applicability of the AI system. Under the 2e criteria, the diagnostic performance of the AI system is comparable to that of senior radiologists, and AI assistance significantly improves the diagnostic capabilities of junior radiologists. This has the potential to reduce unnecessary invasive diagnostic procedures in real-world clinical practice.

© 2024. The Author(s).
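For readers less familiar with the metrics quoted in the abstract, the following is a minimal Python sketch of how sensitivity, specificity, and accuracy follow from a binary confusion matrix. The class `ConfusionCounts`, the function `single_point_auc`, and the example counts are hypothetical and are not taken from the study. The abstract does not say how the AUC was computed. The sketch only notes that each reported AUC agrees, to three decimals, with the area under a ROC curve drawn through a single operating point, which equals (sensitivity + specificity) / 2. That is an observation about the published numbers, not a description of the authors' method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary benign/malignant call against a reference standard."""

    tp: int  # malignant nodules called malignant
    fn: int  # malignant nodules called benign
    tn: int  # benign nodules called benign
    fp: int  # benign nodules called malignant

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def accuracy(self) -> float:
        total = self.tp + self.fn + self.tn + self.fp
        return (self.tp + self.tn) / total


def single_point_auc(sensitivity: float, specificity: float) -> float:
    """Area under the ROC polyline (0,0) -> (1 - spec, sens) -> (1,1)."""
    return (sensitivity + specificity) / 2


# Hypothetical counts for illustration only (NOT data from the study).
example = ConfusionCounts(tp=80, fn=20, tn=170, fp=30)

# (sensitivity, specificity, AUC) of the AI system as reported in the abstract.
REPORTED = {
    "pathology": (0.826, 0.815, 0.821),
    "arbitration committee": (0.946, 0.966, 0.956),
    "global 2e": (0.868, 0.934, 0.901),
}

if __name__ == "__main__":
    print(f"sensitivity={example.sensitivity:.3f}")
    print(f"specificity={example.specificity:.3f}")
    print(f"accuracy={example.accuracy:.3f}")
    for standard, (sens, spec, auc) in REPORTED.items():
        estimate = single_point_auc(sens, spec)
        print(f"{standard}: reported AUC={auc}, (sens+spec)/2={estimate:.4f}")
```

Accuracy, unlike sensitivity and specificity, depends on the proportion of malignant nodules in the evaluated set, which is one reason the three reference standards in the abstract yield different accuracy figures.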