Research Updates
Articles below are published ahead of final publication in an issue. Please cite articles in the following format: authors, (year), title, journal, DOI.

Weakly Supervised Classification for Nasopharyngeal Carcinoma with Transformer in Whole Slide Images.

Published: 2024 Jul 03
Authors: Ziwei Hu, Jianchao Wang, Qinquan Gao, Zhida Wu, Hanchuan Xu, Zhechen Guo, Jiawei Quan, Lihua Zhong, Ming Du, Tong Tong, Gang Chen
Source: IEEE Journal of Biomedical and Health Informatics

Abstract:

Pathological examination of nasopharyngeal carcinoma (NPC) is indispensable for diagnosis, guiding clinical treatment, and judging prognosis. Traditional fully supervised NPC diagnosis algorithms require manual delineation of regions of interest on gigapixel whole slide images (WSIs), which is laborious and often biased. In this paper, we propose a weakly supervised framework based on the Tokens-to-Token Vision Transformer (WS-T2T-ViT) for accurate NPC classification using only slide-level labels. Tile images inherit their labels from the slide-level label. Specifically, WS-T2T-ViT is composed of a multi-resolution pyramid, T2T-ViT, and a multi-scale attention module. The multi-resolution pyramid imitates the coarse-to-fine process of manual pathological analysis to learn features at different magnification levels. The T2T module captures local and global features to overcome the lack of global information. The multi-scale attention module improves classification performance by weighting the contributions of different granularity levels. Extensive experiments are performed on an 802-patient NPC dataset and the CAMELYON16 dataset. WS-T2T-ViT achieves an area under the receiver operating characteristic curve (AUC) of 0.989 for NPC classification on the NPC dataset. The experimental results on the CAMELYON16 dataset demonstrate the robustness and generalizability of WS-T2T-ViT in WSI-level classification.
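
Two ideas from the abstract can be illustrated with a minimal Python sketch: tiles inheriting the slide-level label, and per-magnification predictions being combined with attention-style weights. This is not the authors' implementation. The abstract does not specify how the multi-scale attention module computes its weights, so the softmax weighting below is an assumption, and the names `inherit_tile_labels`, `softmax`, and `fuse_scales` are hypothetical.

```python
import math


def inherit_tile_labels(slide_label, num_tiles):
    """Give every tile the label of the slide it was cut from (weak supervision)."""
    return [slide_label] * num_tiles


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scales(scale_probs, scale_scores):
    """Combine per-magnification probabilities into one prediction.

    scale_probs:  one predicted probability per magnification level.
    scale_scores: unnormalized attention scores, one per level
                  (an assumption; the abstract gives no formula).
    """
    if len(scale_probs) != len(scale_scores):
        raise ValueError("need one attention score per magnification level")
    weights = softmax(scale_scores)
    return sum(w * p for w, p in zip(weights, scale_probs))


if __name__ == "__main__":
    # A slide labeled 1 (tumor) passes that label to each of its 4 tiles.
    print(inherit_tile_labels(slide_label=1, num_tiles=4))
    # Equal attention scores reduce the fusion to a plain average.
    print(fuse_scales([0.62, 0.80, 0.91], [0.0, 0.0, 0.0]))
```

With equal scores the fusion is a simple mean across magnification levels. Raising the score of one level shifts the fused prediction toward that level, which is the sense in which contributions of different granularity levels are weighted.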