Pattern-Aware Transformer: Hierarchical Pattern Propagation in Sequential Medical Images.
Publication date: 2023 Aug 18
Authors:
Lingyun Wu, Xiang Gao, Zhiqiang Hu, Shaoting Zhang
Source:
IEEE TRANSACTIONS ON MEDICAL IMAGING
Abstract:
This paper investigates how to effectively mine contextual information among sequential images and model them jointly in medical imaging tasks. Unlike state-of-the-art methods that model sequential correlations via point-wise token encoding, this paper develops a novel tokenization strategy, namely hierarchical pattern-aware tokenization. It handles different visual patterns independently and hierarchically, which not only ensures full flexibility of attention aggregation under different pattern representations but also preserves local and global information at the same time. Based on this strategy, we propose a Pattern-Aware Transformer (PATrans) featuring a global-local dual-path pattern-aware cross-attention mechanism to achieve hierarchical pattern matching and propagation among sequential images. In addition, PATrans is plug-and-play and can be naturally embedded into different backbone networks for various downstream sequence modeling tasks. We demonstrate its general application paradigm across four domains and five benchmarks, covering video object detection and 3D volumetric semantic segmentation tasks. PATrans sets a new state of the art on all of these benchmarks: CVC-Video (92.3% detection F1), ASU-Mayo (99.1% localization F1), Lung Tumor (78.59% DSC), Nasopharynx Tumor (75.50% DSC), and Kidney Tumor (87.53% DSC). Code and models are available at https://github.com/GGaoxiang/PATrans.
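The three segmentation results above are reported as DSC (Dice similarity coefficient), which measures the overlap between a predicted mask and a reference mask as twice the intersection divided by the sum of the two mask sizes. The following is a minimal sketch of that standard definition for flattened binary masks. It is not taken from the PATrans repository: the function name `dice_coefficient`, the toy masks, and the choice to return 1.0 when both masks are empty are assumptions made for illustration.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice similarity coefficient between two flattened binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")

    # Voxels labeled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy example (hypothetical data): 2 overlapping voxels, 3 foreground voxels
# in each mask, so DSC = 2 * 2 / (3 + 3) = 2/3.
pred_mask = [1, 1, 0, 0, 1, 0]
target_mask = [1, 0, 0, 0, 1, 1]
print(f"DSC = {dice_coefficient(pred_mask, target_mask):.4f}")
```

A 3D volume would be flattened to a single sequence before being passed in. How the paper aggregates per-case scores into the reported percentages is not stated in the abstract, so this sketch makes no assumption about it.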