A 3D boundary-guided hybrid network with convolutions and Transformers for lung tumor segmentation in CT images.
Published: 2024 Aug 12
Authors:
Hong Liu, Yuzhou Zhuang, Enmin Song, Yongde Liao, Guanchao Ye, Fan Yang, Xiangyang Xu, Xvhao Xiao, Chih-Cheng Hung
Source:
COMPUTERS IN BIOLOGY AND MEDICINE
Abstract:
Accurate lung tumor segmentation from Computed Tomography (CT) scans is crucial for lung cancer diagnosis. Since 2D methods lack the volumetric information of lung CT images, 3D convolution-based and Transformer-based methods have recently been applied to lung tumor segmentation in CT imaging. However, most existing 3D methods cannot effectively combine the local patterns learned by convolutions with the global dependencies captured by Transformers, and they largely ignore the important boundary information of lung tumors. To tackle these problems, we propose BGHNet, a 3D boundary-guided hybrid network that uses convolutions and Transformers for lung tumor segmentation. In BGHNet, we first propose the Hybrid Local-Global Context Aggregation (HLGCA) module, which has parallel convolution and Transformer branches in the encoding phase. To aggregate local and global contexts in each branch of the HLGCA module, we design both the Volumetric Cross-Stripe Window Transformer (VCSwin-Transformer), which builds the Transformer branch with local inductive biases and large receptive fields, and the Volumetric Pyramid Convolution with Transformer-based extensions (VPConvNeXt), which builds the convolution branch with multi-scale global information. We then present a Boundary-Guided Feature Refinement (BGFR) module in the decoding phase, which explicitly leverages boundary information to refine multi-stage decoding features for better performance. Extensive experiments were conducted on two lung tumor segmentation datasets: a private dataset (HUST-Lung) and a public benchmark dataset (MSD-Lung). Results show that BGHNet outperforms other state-of-the-art 2D and 3D methods in our experiments, and it exhibits superior generalization performance on both non-contrast and contrast-enhanced CT scans. Copyright © 2024 Elsevier Ltd. All rights reserved.
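The BGFR module above hinges on deriving an explicit boundary signal from a tumor mask. As a minimal NumPy sketch of that idea (not the paper's implementation; the abstract does not specify the exact boundary operator, so a morphological-gradient-style rule over 6-connected face neighbors is assumed here), the following extracts a one-voxel-thick boundary map from a 3D binary mask:

```python
import numpy as np

def boundary_map(mask: np.ndarray) -> np.ndarray:
    """Extract a one-voxel-thick boundary from a 3D binary mask.

    A voxel is on the boundary if it is foreground and at least one
    of its 6 face-neighbors is background (a morphological-gradient
    style definition; assumed for illustration, not taken from BGHNet).
    """
    mask = mask.astype(bool)
    # Pad with background so volume edges count as boundaries.
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    interior = np.ones_like(mask, dtype=bool)
    # A voxel is interior only if all 6 face-neighbors are foreground.
    for axis in range(3):
        for shift in (-1, 1):
            neighbor = np.roll(padded, shift, axis=axis)[1:-1, 1:-1, 1:-1]
            interior &= neighbor
    return mask & ~interior

# Toy example: a 3x3x3 solid cube inside an 8^3 volume.
volume = np.zeros((8, 8, 8), dtype=bool)
volume[2:5, 2:5, 2:5] = True
b = boundary_map(volume)
# All 26 shell voxels of the cube are boundary; only the center is interior.
```

In a BGFR-like decoder, such a boundary map (predicted rather than derived from ground truth at inference time) could serve as a spatial weighting on multi-stage decoding features, emphasizing the tumor margin where segmentation errors concentrate.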