Research Updates
The articles below are published ahead of final publication in an issue. Please cite them in the following format: authors (year), title, journal, DOI.

Classification of cervical lesions based on multimodal features fusion.

Publication date: 2024 May 10
Authors: Jing Li, Peng Hu, Huayu Gao, Nanyan Shen, Keqin Hua
Source: COMPUTERS IN BIOLOGY AND MEDICINE

Abstract:

Cervical cancer is a severe threat to women's health worldwide, with a long cancerous cycle and a clear etiology, making early screening vital for prevention and treatment. Based on the dataset provided by the Obstetrics and Gynecology Hospital of Fudan University, a four-category classification model for cervical lesions, covering Normal, low-grade squamous intraepithelial lesion (LSIL), high-grade squamous intraepithelial lesion (HSIL), and cancer (Ca), is developed. Considering the dataset characteristics, and to fully utilize the research data while preserving the dataset size, the model inputs include original and acetic colposcopy images, lesion segmentation masks, human papillomavirus (HPV) status, ThinPrep cytologic test (TCT) results, and age, but exclude iodine images, whose lesion regions overlap significantly with those under acetic images. Firstly, the change information between the original and acetic images is introduced by calculating the acetowhite opacity, mining the correlation between acetowhite thickness and lesion grade. Secondly, the lesion segmentation masks are used to introduce prior knowledge of lesion location and shape into the classification model. Lastly, a cross-modal feature fusion module based on the self-attention mechanism fuses the image information with the clinical text information, revealing correlations among the features. On the dataset used in this study, the proposed model is comprehensively compared with five excellent models from the past three years, demonstrating superior classification performance and a better balance between performance and complexity. Module ablation experiments further show that each proposed improvement independently raises model performance.

Copyright © 2024 Elsevier Ltd. All rights reserved.
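To make the described pipeline more concrete, below is a minimal sketch (not the authors' implementation) of how colposcopy images, an acetowhite-opacity change map, a lesion-mask prior, and clinical variables (HPV, TCT, age) could be fused through a self-attention module for four-class grading. It assumes PyTorch/torchvision; the ResNet-18 backbone, the simple opacity approximation, and all class names and dimensions are illustrative assumptions, not details from the paper.

# Minimal sketch of a cross-modal self-attention fusion classifier (assumed design).
import torch
import torch.nn as nn
import torchvision.models as models


class CrossModalFusionClassifier(nn.Module):
    def __init__(self, clinical_dim: int = 3, embed_dim: int = 256, num_classes: int = 4):
        super().__init__()
        # Shared CNN backbone over the stacked image inputs:
        # original (3ch) + acetic (3ch) + lesion mask (1ch) + opacity map (1ch) = 8 channels.
        backbone = models.resnet18(weights=None)
        backbone.conv1 = nn.Conv2d(8, 64, kernel_size=7, stride=2, padding=3, bias=False)
        backbone.fc = nn.Identity()                      # expose the 512-d image feature
        self.image_encoder = backbone
        self.image_proj = nn.Linear(512, embed_dim)

        # Clinical text/tabular branch: HPV status, TCT result, age (normalized).
        self.clinical_proj = nn.Sequential(
            nn.Linear(clinical_dim, embed_dim), nn.ReLU(), nn.Linear(embed_dim, embed_dim)
        )

        # Self-attention over the two modality tokens (image, clinical).
        self.fusion = nn.MultiheadAttention(embed_dim, num_heads=4, batch_first=True)
        self.classifier = nn.Linear(embed_dim, num_classes)

    def forward(self, original, acetic, lesion_mask, clinical):
        # Crude acetowhite-opacity stand-in: mean per-pixel change from original to acetic image.
        opacity = (acetic - original).mean(dim=1, keepdim=True)            # (B, 1, H, W)
        x = torch.cat([original, acetic, lesion_mask, opacity], dim=1)     # (B, 8, H, W)
        img_tok = self.image_proj(self.image_encoder(x)).unsqueeze(1)      # (B, 1, D)
        cli_tok = self.clinical_proj(clinical).unsqueeze(1)                # (B, 1, D)
        tokens = torch.cat([img_tok, cli_tok], dim=1)                      # (B, 2, D)
        fused, _ = self.fusion(tokens, tokens, tokens)                     # cross-modal attention
        return self.classifier(fused.mean(dim=1))                          # 4-class logits


if __name__ == "__main__":
    model = CrossModalFusionClassifier()
    logits = model(
        torch.rand(2, 3, 224, 224),   # original colposcopy image
        torch.rand(2, 3, 224, 224),   # acetic-acid colposcopy image
        torch.rand(2, 1, 224, 224),   # lesion segmentation mask
        torch.rand(2, 3),             # HPV, TCT, age (normalized)
    )
    print(logits.shape)               # torch.Size([2, 4])

Treating each modality as a token and letting self-attention weigh image against clinical evidence is one simple way to realize the abstract's "cross-modal feature fusion"; the paper's actual module may differ in structure and scale.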