A LLM-Based Hybrid-Transformer Diagnosis System in Healthcare.
Published: 2024 Oct 16
Authors:
Dongyuan Wu, Liming Nie, Rao Asad Mumtaz, Kadambri Agarwal
Source:
IEEE Journal of Biomedical and Health Informatics
Abstract:
The application of computer-vision-powered large language models (LLMs) to medical image diagnosis has significantly advanced healthcare systems. Recent progress in developing symmetrical architectures has greatly impacted various medical imaging tasks. While CNNs and RNNs have demonstrated excellent performance, these architectures often suffer substantial loss of detailed information: they struggle to capture global semantic information effectively and rely heavily on deep encoders and aggressive downsampling. This paper introduces a novel LLM-based Hybrid-Transformer Network (HybridTransNet) designed to encode tokenized big-data patches with the transformer mechanism, elegantly embedding multimodal data of varying sizes as token-sequence inputs to the LLM. The network then performs both inter-scale and intra-scale self-attention, processing data features through a transformer-based symmetric architecture with a refining module, which facilitates accurate recovery of both local and global context information. Additionally, the output is refined with a novel fuzzy selector. Compared with other existing methods on two distinct datasets, the experimental findings and formal assessment demonstrate that our LLM-based HybridTransNet delivers superior performance for brain tumor diagnosis in healthcare informatics.
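The abstract's pipeline of tokenizing patches of varying sizes into one token sequence, then applying intra-scale and inter-scale self-attention, can be sketched roughly as below. This is a minimal illustrative sketch, not the paper's implementation: the patch sizes (8 and 16), embedding dimension, random projections, and single-head attention without learned Q/K/V matrices are all assumptions made for brevity.

```python
import numpy as np

def extract_patches(img, patch):
    """Split a square 2-D image into non-overlapping patch x patch tiles,
    flattening each tile into one token vector."""
    h, w = img.shape
    tokens = []
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            tokens.append(img[i:i + patch, j:j + patch].reshape(-1))
    return np.stack(tokens)              # (num_patches, patch * patch)

def self_attention(x):
    """Single-head scaled dot-product self-attention (no learned
    projections, purely for illustration)."""
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x

rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32))    # stand-in for one MRI slice
d_model = 16

# Intra-scale: attention among tokens that share one patch size.
seqs = []
for patch in (8, 16):                    # two hypothetical scales
    toks = extract_patches(image, patch)
    w_embed = rng.standard_normal((toks.shape[1], d_model)) * 0.02
    seqs.append(self_attention(toks @ w_embed))

# Inter-scale: concatenate both scales into one sequence so attention
# can mix coarse and fine tokens.
fused = self_attention(np.concatenate(seqs, axis=0))
print(fused.shape)                       # (16 + 4 tokens, d_model)
```

A 32x32 input yields sixteen 8x8 tokens and four 16x16 tokens, so the fused sequence has 20 tokens; in the actual system the embedding and attention weights would of course be learned, and the refining module and fuzzy selector would follow.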