Research Updates
Articles below are published ahead of final publication in an issue. Please cite articles in the following format: authors, (year), title, journal, DOI.

Brain tumor classification in VIT-B/16 based on relative position encoding and residual MLP.

Published: 2024
Authors: Shuang Hong, Jin Wu, Lei Zhu, Weijie Chen
Source: Brain Structure & Function

Abstract:

Brain tumors pose a significant threat to health, and their early detection and classification are crucial. Diagnosis currently relies heavily on pathologists' time-consuming morphological examination of brain images, which can lead to subjective results and misdiagnosis. To address these challenges, this study proposes an improved Vision Transformer-based algorithm for human brain tumor classification. To overcome the small size of existing datasets, Homomorphic Filtering, Channels Contrast Limited Adaptive Histogram Equalization, and Unsharp Masking are applied to enrich the dataset images, enhancing their information content and improving model generalization. To address the limitation of the Vision Transformer's self-attention structure in capturing the sequence order of input tokens, a novel relative position encoding method is employed to improve the model's overall predictive capability. Furthermore, residual structures are introduced into the Multi-Layer Perceptron to counter convergence degradation during training, yielding faster convergence and higher accuracy. Finally, the study analyzes the network model's performance on validation sets in terms of accuracy, precision, and recall. Experimental results show that the proposed model achieves a classification accuracy of 91.36% on an augmented open-source brain tumor dataset, 5.54% higher than the original VIT-B/16. This supports the effectiveness of the proposed approach to brain tumor classification and offers a potential reference for clinical diagnosis.

Copyright: © 2024 Hong et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
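The abstract names two architectural changes, relative position encoding in self-attention and a residual structure in the MLP, without giving their exact form. The NumPy sketch below illustrates one common reading of each idea and is not the authors' implementation. It makes three assumptions: a 1D learned bias table indexed by the token offset i - j (a ViT over image patches would more likely use 2D offsets), a single attention head, and a skip connection that adds the MLP input to its output. All function names are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def relative_position_bias(table, n):
    # table: shape (2n - 1,), one scalar per relative offset i - j (assumed 1D).
    # Offsets range over [-(n - 1), n - 1]; shift them to valid indices [0, 2n - 2].
    idx = np.arange(n)[:, None] - np.arange(n)[None, :] + (n - 1)
    return table[idx]  # shape (n, n), constant along each diagonal


def attention_with_relative_bias(q, k, v, table):
    # Single-head scaled dot-product attention with an additive bias
    # that depends only on the relative position of query and key.
    n, d = q.shape
    logits = q @ k.T / np.sqrt(d) + relative_position_bias(table, n)
    return softmax(logits, axis=-1) @ v


def gelu(x):
    # Tanh approximation of GELU.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


def residual_mlp(x, w1, b1, w2, b2):
    # Two-layer MLP whose output is added back to its input (assumed skip connection).
    h = gelu(x @ w1 + b1)
    return x + (h @ w2 + b2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d, hidden = 4, 8, 16  # toy sizes, not ViT-B/16 dimensions
    q, k, v = (rng.normal(size=(n, d)) for _ in range(3))
    table = rng.normal(size=2 * n - 1)
    attended = attention_with_relative_bias(q, k, v, table)
    out = residual_mlp(
        attended,
        rng.normal(size=(d, hidden)), np.zeros(hidden),
        rng.normal(size=(hidden, d)), np.zeros(d),
    )
    print(attended.shape, out.shape)
```

With an all-zero bias table the attention reduces to plain scaled dot-product attention, and with zero second-layer weights the MLP reduces to the identity. That identity path is the usual argument for why skip connections ease optimization, though whether it matches the convergence behavior reported in the paper is not something the abstract settles.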