Learning From Incorrectness: Active Learning With Negative Pre-Training and Curriculum Querying for Histological Tissue Classification
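Impact factor: 9.8
Journal ranking tier: Medicine, Tier 1 (Top) / Computer Science, Interdisciplinary Applications: Tier 1; Engineering, Biomedical: Tier 1; Engineering, Electrical & Electronic: Tier 1; Imaging Science & Photographic Technology: Tier 1
Publication date: Feb 2024
Authors: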
Wentao Hu, Lianglun Cheng, Guoheng Huang, Xiaochen Yuan, Guo Zhong, Chi-Man Pun, Jian Zhou, Muyan Cai
DOI:
10.1109/TMI.2023.3313509
Abstract
Patch-level histological tissue classification is an effective pre-processing method for histological slide analysis. However, classifying tissue with deep learning incurs expensive annotation costs. To ease the constraint of limited annotation budgets, applying active learning (AL) to histological tissue classification is a promising solution. In practice, however, performance is heavily imbalanced across categories, and the tissues corresponding to the underperforming categories are equally important for cancer diagnosis. In this paper, we propose an active learning framework called ICAL, which contains Incorrectness Negative Pre-training (INP) and Category-wise Curriculum Querying (CCQ) to address this problem from the category-to-category perspective and from the perspective of the categories themselves, respectively. In particular, INP exploits the unique mechanism of active learning by treating the incorrect predictions obtained from CCQ as complementary labels for negative pre-training, so that similar categories are better distinguished during training. CCQ adjusts the query weights according to the learning status of the INP-trained model on each category, and uses uncertainty to evaluate and compensate for the query bias caused by inadequate category performance. Experimental results on two histological tissue classification datasets demonstrate that ICAL achieves performance approaching that of fully supervised learning with less than 16% of the labeled data. Compared with state-of-the-art active learning algorithms, ICAL achieves better and more balanced performance across all categories and remains robust under extremely low annotation budgets. The source code will be released at https://github.com/LactorHwt/ICAL.
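The abstract does not give formulas for INP or CCQ, so the following is only a minimal sketch of the two general ideas it names, not the authors' implementation. It assumes the common complementary-label (negative learning) loss, -log(1 - p_k), where k is a class the patch is known not to belong to, and a purely hypothetical query weighting that gives weaker categories a larger share of the annotation budget. The function names `negative_loss` and `query_weights`, and the use of per-class accuracy as the "learning status", are assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def negative_loss(logits, complementary_label, eps=1e-12):
    """Complementary-label loss: -log(1 - p_k).

    `complementary_label` is a class the sample is known NOT to be,
    e.g. a prediction that an annotator marked as incorrect.
    This is the generic negative-learning loss, assumed here for illustration.
    """
    p = softmax(logits)
    return -math.log(max(1.0 - p[complementary_label], eps))


def query_weights(per_class_accuracy):
    """Hypothetical weighting: share of the query budget per category,
    proportional to how far each category is from perfect accuracy."""
    deficits = [1.0 - a for a in per_class_accuracy]
    total = sum(deficits)
    if total == 0:
        # Every category is already perfect: fall back to uniform weights.
        return [1.0 / len(deficits)] * len(deficits)
    return [d / total for d in deficits]


if __name__ == "__main__":
    # The model wrongly favours class 0; the loss penalises that confidence.
    print(negative_loss([3.0, 0.5, 0.2], complementary_label=0))
    # The weaker second category receives the larger query weight.
    print(query_weights([0.9, 0.5, 0.8]))
```

Under these assumptions, the loss grows as the model becomes more confident in the class marked as wrong, which pushes probability mass toward the remaining categories without requiring the true label.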