Learning From Incorrectness: Active Learning With Negative Pre-Training and Curriculum Querying for Histological Tissue Classification
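Impact factor: 9.8
Journal ranking: Medicine Tier 1 (Top) / Computer Science: Interdisciplinary Applications Tier 1; Engineering: Biomedical Tier 1; Engineering: Electrical & Electronic Tier 1; Imaging Science & Photographic Technology Tier 1
Publication date: Feb 2024
Authors: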
Wentao Hu, Lianglun Cheng, Guoheng Huang, Xiaochen Yuan, Guo Zhong, Chi-Man Pun, Jian Zhou, Muyan Cai
Abstract
Patch-level histological tissue classification is an effective pre-processing method for histological slide analysis. However, classifying tissue with deep learning incurs high annotation costs. Applying active learning (AL) to histological tissue classification is a promising way to work within a limited annotation budget. Nevertheless, performance is highly imbalanced across categories when AL is applied, and the tissues corresponding to the underperforming categories are equally important for cancer diagnosis. In this paper, we propose an active learning framework called ICAL, which contains Incorrectness Negative Pre-training (INP) and Category-wise Curriculum Querying (CCQ) to address this problem from the category-to-category perspective and from the perspective of the categories themselves, respectively. In particular, INP exploits a mechanism unique to active learning: it treats the incorrect predictions obtained from CCQ as complementary labels for negative pre-training, so that similar categories are better distinguished during training. CCQ adjusts the query weights according to the learning status of each category under the model trained with INP, and uses uncertainty to evaluate and compensate for the query bias caused by inadequate category performance. Experimental results on two histological tissue classification datasets demonstrate that ICAL achieves performance approaching that of fully supervised learning with less than 16% of the labeled data. Compared with state-of-the-art active learning algorithms, ICAL achieves better and more balanced performance across all categories and remains robust under extremely low annotation budgets. The source code will be released at https://github.com/LactorHwt/ICAL.
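The abstract gives no formulas, so the following is only a minimal sketch of the two ideas it names, not the authors' implementation. It assumes the standard negative-learning loss, -log(1 - p_k), for a complementary label k (a class the sample is known not to belong to), and a hypothetical query weighting in which weaker categories receive a larger share of the query budget. All function names are illustrative and do not come from the ICAL repository; the uncertainty-based compensation mentioned in the abstract is not modeled here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def negative_learning_loss(logits, complementary_label, eps=1e-12):
    """Standard negative-learning loss: -log(1 - p_k).

    k is a class the sample is known NOT to belong to, so the loss
    pushes probability mass away from that class.
    """
    p = softmax(logits)
    return -math.log(max(1.0 - p[complementary_label], eps))


def complementary_labels(predictions, oracle_labels):
    """Collect wrong predictions on queried samples as complementary labels.

    Returns {sample_index: wrongly_predicted_class} for every queried
    sample whose prediction disagrees with the oracle's annotation.
    """
    return {
        i: pred
        for i, (pred, true) in enumerate(zip(predictions, oracle_labels))
        if pred != true
    }


def category_query_weights(per_class_accuracy):
    """Hypothetical weighting: share of queries proportional to (1 - accuracy)."""
    deficits = [1.0 - a for a in per_class_accuracy]
    total = sum(deficits)
    if total == 0:
        # Every class is perfect: fall back to a uniform split.
        return [1.0 / len(deficits)] * len(deficits)
    return [d / total for d in deficits]
```

In this sketch, a queried patch that the model predicted as class 1 but the annotator labeled as class 2 yields the complementary label 1, which could then feed `negative_learning_loss` in a pre-training pass before ordinary supervised training on the true labels.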